Get the Sustainable 
Development Goals 
back on track 


Most of the goals will be missed. Here’s how to 
put them back on the right path. 


In 2015, world leaders met in New York at a landmark 
conference of the United Nations. Their aim: to end 
poverty, stop environmental destruction and boost 
well-being. In the world of multilateral diplomacy, 
such meetings are not uncommon, but they tend to 
focus on individual areas, such as climate change or food 
security. The 2015 summit was different because heads 
of state and governments pledged concrete action across 
an integrated set of economic, environmental and social 
issues. They signed up to the Sustainable Development 
Goals (SDGs), a package of 17 goals and associated targets 
for ending hunger, eliminating extreme poverty, reducing 
inequality, tackling climate change and halting the loss of 
biodiversity and ecosystems — all by 2030. 

With that deadline now a decade away, the world is set 
to miss most of the SDGs. Just two of them — eliminating 
preventable deaths among newborns and under-fives, 
and getting children into primary schools — are closest 
among all the goals to being achieved. By contrast, the 
goal to eliminate extreme poverty will not be met because 
some 430 million people are expected still to be living in 
such conditions in 2030. 

Targets to end hunger and to protect climate and bio- 
diversity are completely off track. Whereas some of the 
richer countries are making a degree of progress in the 
SDGs overall, two-thirds of poorer ones are not expected 
to meet those that relate even to their most basic needs. 

The SDGs are extremely valuable, and five years is 
too short a time to see real progress towards economic 
transformation, which must happen if the goals are to be 
achieved in full. But at the same time, the SDGs have had a 
considerable positive impact — including in research and 
higher education. Institutions globally are signing up to 
supporting the SDGs, and staff and students are taking 
on responsibilities, from eliminating single-use plastic, 
to switching to renewable energy. The goals’ cross-cutting 
nature has fuelled research, too, providing scientists with 
opportunities in the fields of the environment, engineer- 
ing, health policy, development economics and beyond. 

But these bright spots cannot mask what is still a bleak 
trend. The UN secretary-general, António Guterres, puts 
the halting progress down to a lack of funding — especially 
from the governments of developed countries. The goals 
come with a price tag of between US$5 trillion and $7 trillion 
per year, and the shortfall has been put at $2.5 trillion. 

But there’s a larger obstacle. The goals are still a voluntary 
effort, although monitoring of progress is extensive. A 
UN-affiliated organization called the Sustainable Develop- 
ment Solutions Network produces an annual report that 
shows how well countries are performing on the SDGs, and 
on page 74 of this issue, researchers from the United States 
and China describe how progress can be more accurately 
recorded (Z. Xu et al. Nature 577, 74-78; 2020) (see also 
page 8). But it’s not compulsory for countries to report 
how they are doing. 

To be achieved, the SDGs need to become mandatory — 
not necessarily in the legal sense, but in the sense that 
nations have to know that there’s no alternative but to make 
them happen. One analogy is the way in which countries 
report their economic data. There’s no international law 
that says every country must report data, such as on con- 
sumer spending, that go into calculating its gross domestic 
product (GDP). But for more than 50 years, these data have 
been collected at a granular level and are now reported 
every quarter by national statistics offices. Every agency 
of government understands that a nation’s economy must 
always be seen to be growing, and so the data underlying 
the GDP must also always be increasing. That’s why there’s 
a massive national effort to make sure that everyone works 
towards what could be called the ‘GDP goals’. The SDGs are 
unlikely to be achieved unless they, too, sit at the apex of a 
similar national effort. 

At the same time — and as is often pointed out — some 
GDP goals are in opposition to sustainability efforts such 
as the SDGs. Take new sources of fossil-fuel energy. They 
provide much-needed power for communities lacking 
basic needs and contribute positively to economic growth. 
But they also have a negative impact on the environment 
and on human health. Yet it’s only the positive economic 
impact that counts in official data, and that is one 
reason — although not the only one, by far — why it’s proving 
so difficult to shift power to renewable-energy platforms. 
One solution might be to factor the cost of degrading the 
environment into national accounting — although there is 
as yet little consensus on how this would be done. 


Tighter focus 


One research-led effort where there is more consensus 
is the Global Sustainable Development Report (GSDR). 
Due to be published every four years, it is commissioned 
by the UN secretary-general and written by a team of 
15 authors nominated by UN member states, but working 
independently with the wider scientific community. The first 
report was published last September, and the UN will appoint 
authors for the second one, due in 2023, later this month. 
The first report’s authors are aware that the SDGs lack a 
mandatory reporting mechanism, and that in some cases 
the goals are competing with GDP goals. And they have 
come up with an innovative solution. They recommend that 
nations consider redistributing the 17 SDGs into 6 ‘entry 
points’. These are: human well-being (including eliminating 
poverty and improving health and education); sustainable 
economies (including reducing inequality); access to food 
and nutrition; access to — and decarbonizing — energy; 
urban development; and the global commons (combining 
biodiversity and climate change). 

This is a sensible recommendation. A focus on a smaller, 
more integrated set of goals could help to reduce instances 
in which implementing one of the SDGs has the potential 
to hinder another. Take the case of wind energy. This has a 
part to play in meeting the climate action SDG, but if wind 
farms are sited in the wrong places, or if the turbines are the 
wrong height, they can potentially harm bird populations, 
which would affect the SDG on protecting biodiversity 
and ecosystems. Under the GSDR proposals, climate and 
biodiversity would sit under one category for action. If 
properly implemented, this would mean that decisions 
on new energy sources would need to consider the impli- 
cations for biodiversity — reducing the numbers of wind 
power plants that end up in inappropriate locations. 

So how could the GSDR’s recommendations be imple- 
mented? So far, it’s not clear that they have reached the 
ministries of finance and economics, and the central 
banks, where they need to be heard. Last month, Guterres 
appointed the departing Bank of England governor Mark 
Carney as UN climate envoy. That is a positive move 
because Carney’s office has the potential to expand the 
report’s footprint by creating a formal link between the 
GSDR team and economic policymakers. 

As the 15 scientists tasked with preparing the next report 
take their posts, they must also urge Guterres to give them 
the resources to raise the profile of their work further, so 
that it becomes as well known and influential as the UN 
reports on climate and biodiversity. 

The SDGs were launched in a 2015 UN report called 
Transforming our World. That’s because a world without 
hunger and disease, with meaningful jobs and a clean 
environment, requires transformational change. But, on 
present trends, there are few signs that such change will 
be achieved by 2030. That’s a reason to redouble policy 
efforts guided by evidence. Real change won’t come until 
the research-policy interface is strengthened. Time is 
short, and there’s a lot to do when a decade is all we have. 


Index of 
improvement 


A US-Chinese team shows how sustainability 
metrics can be improved. 


How can a country tell that it’s making progress 
on sustainability? How can it work out, from 
year to year, whether its environment is 
improving, along with the economy and 
well-being? 

This is incredibly difficult. A successful measure must 
have at least three characteristics: it needs to be based on 
a comprehensive set of reliable data; it must be accessible 
to non-specialists; and it has to be updated regularly and 
presented so that progress (or lack of it) can be seen easily. 

For decades, researchers and policymakers have been 
searching for a measure that everyone can agree on. But 
most efforts, from the Human Development Index to the 
Genuine Progress Indicator, end up lacking some aspect 
of those three characteristics. 

The need is becoming more urgent now that the inter- 
national community is set on its 2030 deadline to meet 
the United Nations’ 17 Sustainable Development Goals 
(SDGs), which aim to end poverty and hunger, tackle 
climate change and more. 

The UN publishes an annual report that ranks countries 
on their progress towards each goal, with a score out of 100. 
It shows how nations are doing relative to each other and 
whether they’re on track to meeting the goals (most are 
not — see page 7). But the report doesn’t record local-level 
data, and inter-year comparisons are hard. 

For example, Denmark — the top-ranked country in the 
2019 report, with an impressive aggregate score of 85.2 — 
still has some way to go in reaching Goal 14, which measures 
the health of the marine environment (‘life below water’). 
But those who want to know whether Denmark’s score 
has improved over time are forced to comb through PDFs 
of the previous years’ reports, and these include nothing 
comparing different parts of the country. 

But help could be at hand. In Nature this week, a team led 
by researchers from Michigan State University in East Lansing 
and China Agricultural University in Beijing show how it’s 
possible to use the SDG reporting framework to construct 
an index that allows progress to be compared across regions 
and over periods of time (Z. Xu et al. Nature 577, 74–78; 2020). 

The team chose China as its case study, and the results 
show that the country’s overall SDG score increased from 
45.5 in 2000 to 55.4 in 2015. Each of its 31 provinces also 
increased its score. Nationally, the trend is in the right 
direction, although the rate of progress so far is not enough 
to meet the 2030 target. Moreover, China’s scores have 
fallen in four goals — life below water, responsible produc- 
tion and consumption, gender equality, and climate action. 

Can such an approach to data gathering be scaled up? 
Yes, but it needs a large literature base to draw on, and 
public authorities must be willing to recognize the value 
of such an effort — and must know how to use it. 

China’s government is aware of the environmental and 
social risks of rapid industrialization, and the country has an 
active community of researchers and policymakers working 
on sustainability measures. The authors of the paper went 
to national data sources such as the National Bureau of Statistics 
of China, as well as specialized sources that hold data 
on health, energy and population — all of which are accessible 
for research. But that is expensive on a global scale. 
In many low- and middle-income countries, especially, the 
infrastructure to collect such data still needs to be built. 

This work is a milestone, nonetheless, because it shows 
how it’s possible to measure detailed progress towards 
the SDGs, and to reveal where countries fall short. With 
17 goals and just 10 years in which to achieve them, the 
world needs better measures to see both how far we have 
come, and how far we have to go. 
A personal take on science and society 


World view 


By Simine Vazire 


A toast to the 
error detectors 


Let 2020 be the year in which we value those 
who ensure that science is self-correcting. 


Last month, I got a private Twitter message from 
a postdoc bruised by the clash between science 
as it is and how it should be. He had published a 
commentary in which he pointed out errors in a 
famous researcher’s paper. The critique was accurate, 
important and measured — a service to his field. But it 
caused him problems: his adviser told him that publishing the 
criticism had crossed a line, and he should never do it again. 
Scientists are very quick to say that science is self-correcting, 
but those who do the work behind this correction often 
get accused of damaging their field, or worse. My impression 
is that many error detectors are early-career researchers who 
stumble on mistakes made by eminent scientists, and naively 
think that they are helping by pointing out those problems 
— but, after doing so, are treated badly by the community. 
Stories of scientists showing unwarranted hostility to 
error detectors are all too common. Yes, criticism, like science, 
should be done carefully, with due diligence and a 
sharp awareness of personal fallibility. Error detectors need 
to keep conversations focused on concrete facts, and should 
be open to benign explanations for apparent problems. 
Even when criticism is done well, error detectors are often 
subjected to personal attacks. Junior scientists are accused of 
bullying their seniors. In one case, early-career researchers 
who showed that a famous scientist had engaged in extensive 
self-citation and recycled his own publications were accused 
of being vigilantes and mounting a witch hunt. Scientists who 
found flaws in high-profile nutrition research that required 
retractions were accused of cyberbullying and, bizarrely, of 
holding a grudge against school-lunch programmes. And 
those are just a few incidents that became public. 
Researchers are often warned against pointing out errors 
— and sometimes kindness is used as justification. They 
are told to focus on improving their own research, or to 
state only the positive aspects of that done by others. If you 
don’t have anything nice to say, don’t say anything at all. 
There are several problems with these arguments. First, we 
scientists present ourselves as a community of individuals 
committed to scrutinizing each other. Historian of science 
Naomi Oreskes, in urging non-scientists to trust science, 
argues that “scientists have a kind of culture of collective distrust”. 
We cannot tell people to trust us because we monitor 
each other, and then appeal to kindness to halt that scrutiny. 
Second, when we suggest that those working on error 
detection and correction are being unkind, we are the ones 
being unkind. Imagine that you are a trainee. You feel that 
science values self-correction, and that it’s not about any 
one person’s ego, but the collective motivation to find new 
knowledge, to check everything thrice or more, to discard 
false hypotheses and so to move ever closer to truth. Thus, 
when you find an error, you trust that it’s okay to point it 
out. And then you find yourself accused of being a destruc- 
tive, sanctimonious second-stringer — all for applying the 
‘scientific values’ that you'd been taught. 

Yes, error detectors can make research less comfortable 
— but that discomfort is healthy. We should feel responsible 
for minimizing errors in our work, and worried that we 
might have missed some. 

Scientific criticism must not be conflated with bullying. It’s 
not fair to victims of actual bullying to use the term so loosely 
and inappropriately. Instead, we need mechanisms to pro- 
tect those who engage in scientific criticism. These mecha- 
nisms would make science fairer and more inclusive. Advisers 
can get away with awful behaviour — bullying, harassment 
and other abuses of power — because their trainees are so 
dependent on them for funding, recommendations and 
other opportunities. Universities need to hold themselves 
and senior faculty members accountable for preventing 
abuse, including intimidation and bullying of error detectors. 

We should do more to make criticism an established part of 
science. Universities need policies that assess inappropriate 
responses to criticism. Responsible research training should 
include sessions on how to assess whether apparent anom- 
alies could be substantive problems, how to communicate 
concerns and how to respond when issues arise. Funders and 
research-evaluation committees should find ways to support 
and recognize all the work that error detection requires. 

Furthermore, journals need to make clearer and firmer 
commitments to self-correction. In my opinion, they have a 
responsibility to share replication attempts for the work that 
they publish, including creating explicit criteria to enable 
publication of high-quality replications. Consider the Social 
Science Replication Project (C. F. Camerer et al. Nature Hum. 
Behav. 2, 637–644; 2018), which focused on systematically 
repeating 21 experiments published in Science and Nature. It 
was an author, not either journal, who said that both journals 
had rejected the submission and shared the reasons given 
for doing so. As a former editor-in-chief of Social Psychological 
and Personality Science, I was shocked at how easy it 
would be to reject or hide criticism of the editorial process. 
There should be greater transparency and other measures 
of accountability over editors, senior authors and reviewers. 

It’s time to be kinder to those doing the criticizing, and 
to demand more accountability and humility from those 
in power. Instead of punishing people who flag errors, we 
should scramble to hire them, give them prizes and award 
them grants so they can keep improving science. The least we 
can do is provide a space for fact-based criticism that is safe 
from intimidation and retaliation. It’s only thanks to error 
detectors that we can proclaim that science is self-correcting. 
The world this week 


News in focus 


Ross Mandi Wunungmurra helped to negotiate the return of blood samples to his community. 
AUSTRALIAN BIOBANK 
REPATRIATES HUNDREDS OF 
INDIGENOUS BLOOD SAMPLES 


The return is part of a groundbreaking approach that could inspire other 
institutions grappling with how to use historical samples ethically in research. 


By Dyani Lewis 


Late last year, the Galiwin’ku community 
of Elcho Island off the coast of northern 
Australia celebrated the return of more 
than 200 vials of blood that were collected 
from their ancestors half a century 
ago, before modern research principles 
on informed consent existed. Unbeknownst 
to the Galiwin’ku community, the blood vials 
had been in freezers at the Australian National 
University in Canberra ever since. 

Many Indigenous Australian communities 
believe that the remains of their people, 
including blood and hair, must return to their 
ancestral home, or Country, to be at peace. 
Having the vials returned “meant a lot to us”, 
says Ross Mandi Wunungmurra, chair of the 
Yalu Aboriginal Corporation, the community 
organization that helped negotiate the return. 
Mandi is one of several hundred living mem- 
bers of the community whose blood was also 
collected after a typhoid outbreak in 1968. 
Before the samples from deceased people 
were repatriated, their relatives gave permis- 
sion for DNA to be extracted from the blood. 
People who are still alive offered fresh samples. 
The genetic information will be stored in the 
biobank of the National Centre for Indigenous 
Genomics (NCIG), which the Australian 
National University (ANU) established specifically 
to manage its historical samples. 

The return was part of a groundbreaking 
attempt by the NCIG to right the research 
wrongs of the past. It comes against a back- 
drop of global uncertainty about what institu- 
tions should do with such historical samples, 
which might contain genetic or other informa- 
tion that is valuable to science, but which were 
gathered before the establishment of mod- 
ern research principles governing the ethical 
collection and storage of such samples. When 
the Galiwin’ku samples were collected, Australia’s 
government had only recently recognized 
Indigenous people as citizens, and racist 
attitudes that denied them the same rights as 
white Australians were rife. 

Scientists say that the approach is laudable, 
and could be adopted by other institutions 
with similar legacy collections. But some 
researchers warn that it may be challenging 
to find a data-access policy that satisfies both 
Indigenous communities and the researchers 
who want to use the data. 

Governed by a majority-Indigenous board, 
the NCIG has a mandate to approach com- 
munities whose historical samples are in the 
ANU’s store and ask whether the samples 
should be kept for future research, returned 
or destroyed. So far, the team has contacted 
four out of several dozen communities. 

“The basic principle here is we just do what 
the community wants us to do,” says NCIG 
director Simon Easteal. 


Innovative approach 


Researchers say the scale of the NCIG’s endeav- 
our is impressive. Visiting communities, many 
remote, to ask them what to do about historical 
samples is resource-intensive and beyond the 
budget of many institutions, so many just leave 
such samples in their freezers, says Easteal. 
Sometimes researchers will ask communi- 
ties for permission to collect specimens for an 
individual research project, but that doesn’t 
solve the problem of what to do with the 
specimens once that project is over, he adds. 
Negotiations between the Galiwin’ku com- 
munity and the NCIG took two years, and 
involved people from both groups travelling 
between Canberra and Elcho Island many 
times, says Azure Hermes, a Gimuy Walubara 
Yidinji woman from far north Queensland who 
runs community engagement for the centre. 
The centre will attempt to follow the wishes 
of every Indigenous person whose samples 
are in its collection, which includes specimens 
and records from 7,000 Indigenous people. If 
the person from whom a sample was collected 
has died, the centre will consult their relatives. 
Of the roughly 2,000 people from 4 commu- 
nities whom the NCIG has contacted, about 90% 
have given permission for their DNA or the DNA 
of their deceased relatives to be extracted and 
data added to the NCIG biobank, says Hermes. 
“Australia is definitely leading the way with 
legacy samples or orphan samples, and fig- 
uring out how to deal with them,” says Ripan 
Malhi, an anthropologist at the University 
of Illinois at Urbana-Champaign, who has 
worked with Native American communities. 
The NCIG is giving communities control over 
their genomic data, as well as their samples. 
Data in the centre’s biobank will eventually 
be available for other researchers, but partic- 
ipants are able to withdraw consent for their 
DNA to be used in specific projects — or the 
biobank as a whole — at any time using an 
online portal, an approach known as dynamic 
consent. Annual visits to communities provide 
further opportunities for people to make decisions 
about how their data are used, and learn 
about research outcomes, says Hermes. 
Dealing with the genomic data appropri- 
ately is just as important as handling the sam- 
ples themselves sensitively, says Maui Hudson, 
a Maori man who is a research ethicist at the 
University of Waikato in New Zealand. 

But he says that the dynamic-consent model 
is at odds with the move towards open data in 
research. Communities “need to be involved 
in the process of decisions about what appropriate 
uses look like, and that’s not possible in 
a truly open-data environment”, he says. 


UNITED STATES TO FUND 
GUN-VIOLENCE RESEARCH 
AFTER 20-YEAR FREEZE 


Government spending deal includes 
$25 million for studies of firearms safety. 


By Nidhi Subbaraman 


Lawmakers in the United States have 
reached an agreement that would fund 
gun-violence research for the first time 
in more than 20 years. 

A wide-ranging spending bill intro- 
duced on 16 December includes US$25 million 
for studies on the issue, split evenly between 
the Centers for Disease Control and Preven- 
tion (CDC) and the National Institutes of 
Health (NIH). President Donald Trump signed 
the bill into law on 20 December, after it was 
approved by the House of Representatives 
and the Senate. 

“It’s a good start,” says Garen Wintemute, 
director of the Violence Prevention Research 
Program at the University of California, Davis, 
who has been studying gun violence for dec- 
ades. “Violence-prevention policy should be 
guided by solid scientific evidence.” 




“Is it adequate? Absolutely not. But is it 
meaningful and is it important? Absolutely 
yes,” says Mark Rosenberg, president emeritus 
of the non-profit Task Force for Global Health 
in Atlanta, Georgia, and the founding director 
of the CDC’s National Center for Injury Pre- 
vention and Control (NCIPC), also in Atlanta. 

The CDC says that 39,773 people died of 
gun-related injuries in 2017, the last year for 
which it has released a full analysis. 

The federal government stopped funding 
gun-violence research after Congress passed 
a rule called the “Dickey Amendment” in 
1996. It barred the CDC from using funds “to 
advocate or promote gun control”. That was 
widely interpreted as prohibiting the funding 
of research into gun violence. 

Jay Dickey, the Republican congressman 
from Arkansas who wrote the amendment, 
reversed his position on gun-violence research 
in the years before his death. “Both of us now 
believe strongly that federal funding for 
research into gun-violence prevention should 
be dramatically increased,” Dickey wrote in The 
Washington Post in 2015, along with former 
NCIPC chief Rosenberg. 


Slow thaw 


Last year, Congress clarified that the ban on 
federal dollars for “advocacy” or the promotion 
of gun control did not extend to a ban on 
research. But lawmakers did not immediately 
set aside money for such research. The new 
law will require that the CDC and NIH direc- 
tors report back to Congress to ensure that 
any grants they award “support ideologically 
and politically unbiased research projects”. 

David Studdert, who studies health law at 
Stanford Law School in California, says that 
the push to fund gun-violence work at the NIH 
and CDC is encouraging, but that meaningful 
research would require sustained support. 

“We’ve lost several generations of researchers 
in this field, and it’s going to take a while to 
build that back up,” Studdert says. 

The federal government is best positioned 
to undertake such an immense financial 
commitment, says Andrew Morral, a senior 
behavioural scientist at the RAND Corpora- 
tion in Arlington, Virginia, and director of 
the National Collaborative on Gun Violence 
Research, a philanthropic organization that 
funds research on the topic. “Where are ille- 
gal guns coming from? Are different state 
laws effective? Are the programmes that are 
being developed to counter firearm suicide 
effective? There are so many questions that 
we don’t have answers for.” 
Flying to conferences creates a large carbon footprint. 


VIRTUAL SCIENCE 
CONFERENCE TRIES TO 
RECREATE SOCIAL BUZZ 


Psychologists assessed interaction at a meeting 
broadcast in 32 nations to cut its carbon footprint. 


By Alison Abbott 


Hundreds of attendees watched circadian 
biologist Paolo Sassone-Corsi 
give his keynote talk at a scientific 
meeting last month. But barely one-fifth 
of them were sitting in the lecture 
hall in Munich, Germany. The others were 
viewing from virtual hubs across 18 time zones. 

The five-hour ‘pop-up’ conference on 
18 November was an experiment to test the 
feasibility of making scientific meetings virtual, 
in a bid to cut the large carbon footprints 
created by attendees’ air travel. 

Organizers of academic and other interna- 
tional meetings have begun experimenting 
with ways to offset or cut down on carbon 
emissions, but the November meeting of the 
European Biological Rhythms Society (EBRS) is 
one of the first to take a systematic approach to 
retaining a key benefit of conventional meet- 
ings: networking and face-to-face contact. Its 
organizers invited psychologists to evaluate 
whether technology and organizational tech- 
niques can aid interaction and networking, 
for example by enabling seamless discussion 
across different locations, and encouraging 
participants at all sites to hold social events. 

“We are now busy analysing the outcome, 
but at first glance it seems to have been more 
successful than I had dared hope,” says Martha 
Merrow, a circadian biologist at the Ludwig 
Maximilian University (LMU) in Munich, 
who organized the mostly virtual meeting. 
Participants, who joined from 32 countries, 
said there were advantages beyond cutting 
carbon — for instance, parents who might find 
it difficult to arrange travel could attend. The 
EBRS says it will continue experimenting with 
the approach. 


Virtual movement 


The experiment comes in a year of worldwide 
activism on climate change, and as scientists 
in many fields have started to think about the 
carbon footprints of their globetrotting activities. 
“Our work tends to be dominated by international 
meetings and flights,” says Corinne Le 
Quéré, a climate scientist at the Tyndall Centre 
for Climate Change Research in Norwich, UK. 
“We need to have a plan to reduce emissions 
by carrying out our work differently.” 

In 2015, Le Quéré co-authored one of the 
first carbon-reduction strategies created for 
a research institute. It recommended that 
scientists monitor the carbon output of their 
professional activities, avoid travelling to 
meetings unnecessarily and prioritize events 
with only small carbon footprints. 

Le Quéré says that the Tyndall Centre has 
since tested ways to reduce travel, such as 
using video-conferencing, and many meetings 
are trying similar online approaches. 


Fluent discussions 


The EBRS meeting is a more advanced experiment, 
says Le Quéré, because of the inclusion 
of psychologists. 

For Merrow, who was inspired by the cli- 
mate-strike movement, the pop-up confer- 
ence was a way to test the waters. She chose a 
topic — the influence of the circadian rhythm 
on metabolism — for which there was lots of 
expertise near Munich, where all the talks were 
given. 

Sassone-Corsi, who is based at the University 
of California, Irvine, was in Europe anyway 
when he gave the plenary lecture. Six short 
talks were repeated before and after his speech 
to ensure that participants in all the time zones 
could listen to them, whether in the morning 
or late evening. Three of the speakers travelled 
to Munich by train or car, and Merrow bought 
carbon offsets to compensate for the drive. 

Invited speakers were enthusiastic, she says. 
Sassone-Corsi says, “The scientific endeavour 
has become too big — we all travel to too many 
meetings, and I get very tired.” He travels inter- 
continentally around ten times a year. 

The meeting was broadcast to five virtual 
hubs through high-quality, two-way video sys- 
tems at universities in Tel Aviv, Israel; Zurich, 
Switzerland; Boston, Massachusetts; Tokyo; 
and Porto Alegre in Brazil. Another 69 hubs 
were set up for small groups of researchers 
to watch one-way video broadcasts and send 
questions or comments through Twitter. 

“It was possible to have fluent scientific 
discussions,” says Merrow, and some satellite 
groups organized local social events. In total, 
at least 450 people attended the conference 
and nearly 60% joined in through the interac- 
tive hubs on Twitter. About 10% more people 
attended the virtual meeting than went to the 
EBRS’s annual conference in August in Lyon, 
France. 


Psychological needs 
Merrow invited LMU psychologist Anne 
Frenzel to assess the success of the approach, 
and the two are analysing feedback collected 
at the virtual conference and the Lyon meeting. 
Aside from cutting emissions, participants 
mentioned advantages of the virtual meeting, 
including not losing time and energy to travel, 
and students being able to attend for free. 
Scientists in Brazil and Israel mentioned that it 
released them from the bureaucracy involved 
in booking flights to overseas conferences. 
“This is not only about carbon footprints — it 
also offers a huge opportunity to think 
innovatively about how scientific discussions take 
place,” says Merrow. 


Nature | Vol 577 | 2 January 2020 | 13 




Q&A 


Joelle Pineau 


Joelle Pineau doesn’t want science’s 
reproducibility crisis to come to artificial 
intelligence (AI). So the machine-learning 
scientist at McGill University and Facebook 
in Montreal, Canada, is spearheading a 
movement to get AI researchers to open their 
methods and code to scrutiny. She holds a role 
dedicated to reproducibility on the 
organizing committee for the Conference 
on Neural Information Processing 
Systems (NeurIPS), a major AI meeting. 
At last month’s gathering in Vancouver, 
Canada, Pineau told Nature about the 
measures the committee put in place. 


Why are some algorithms irreproducible? 
It’s true that with code, you press start 
and, for the most part, it should do the 
same thing every time. The challenge can 
be trying to reproduce a precise set of 
instructions in machine code from a paper. 
And then there’s the issue that papers 
don’t always give all the detail, or give 
misleading detail. That's a big issue. 


What got you interested in reproducibility? 
I fell into it by accident. My students would 
say ‘I can’t get these results’, or to get the 
results, they had to do things that I thought 
were methodologically wrong. It’s important 
to stop it before it becomes the norm. 


What reproducibility measures were 
enacted at NeurIPS this year? 

We encouraged people to submit their 
code; we're running a reproducibility 
challenge; and we introduced a checklist 
for papers. The checklist asks, for example, 
whether you clearly labelled the type of 
metrics and measures you're using, what 
the details of your model are and how you 
set certain aspects of the model that can 
change the results a lot. 


What has the reception been like? 

Very good. Code submission is one of the 
elements I’m most impressed with. A year 
ago, 50% of accepted NeurIPS papers 
contained a link to code; this year, it’s 75%. 


Interview by Elizabeth Gibney 
This interview has been edited for length 
and clarity. 


RUSSIA JOINS RACE 
TO MAKE QUANTUM 
DREAMS REAL 


National initiative aims to build a quantum computer 
and develop practical technologies. 


By Quirin Schiermeier 


Russia has launched an effort to build a 
working quantum computer, in a bid to catch 
up with other countries in the race to develop 
practical quantum technologies. 

The government will inject about 50 billion 
roubles (US$790 million) over the next 5 years 
into basic and applied quantum research at 
leading Russian laboratories, the country’s 
deputy prime minister, Maxim Akimov, 
announced on 6 December. 

“This is a real boost,” says Aleksey Fedorov, 
a quantum physicist at the Russian Quantum 
Center (RQC), a private research facility in 
Skolkovo near Moscow. “If things work out 
as planned, this initiative will be a major step 
towards bringing Russian quantum science to 
a world-class standard.” 

Quantum computers use elementary 
particles, which can exist in multiple quantum 
states at once, to carry out calculations. 
Quantum bits, or qubits, can in theory process 
information exponentially faster than 
the binary one-zero bits used in classical 
computing. Powerful quantum computers 
could be used to predict the outcomes of 
chemical reactions, search huge databases 
or factor large numbers, such as those used 
in encryption. 


A quantum processor with a 2,048-qubit chip. 

Quantum technology already receives 
massive governmental support in a number 
of countries, including China, the United States 
and Germany. The European Union’s €1-billion 
(US$1.1-billion) Quantum Flagship programme, 
first announced in 2016, is expected to produce 
technology-demonstration projects, such as 
a quantum processor on a silicon chip, within 
a few years. 

US technology companies are also racing 
to create quantum computers that outper- 
form classical machines in specific tasks. 
Prototypes developed by Google and IBM, for 
example, are becoming as capable as classical 
computers. In October, scientists at Google 
announced that a quantum processor working 
on a specific calculation had achieved such a 
quantum advantage. Russia is “five to ten years 
behind” other countries, says Fedorov. “But 
there’s a lot of potential here.” 

Poor funding has excluded Russian 
quantum scientists from competing with 
Google, says Ilya Besedin, an engineer at the 
National University of Science and Technology 
in Moscow. The national quantum initiative 
might help to turn this around, he says. 

“No one is close to the quantum-computing 
capacity that would be required for practical 
applications,” says Besedin. “We’re all looking 
for new avenues to explore. With serious 
government support, this is going to become 
a very interesting research opportunity.” 


Home-grown qubits 


The initiative comes as quantum science in 
Russia begins to recover from the departure, 
in the 1990s and 2000s, of top researchers who 
left for better salaries and funding opportunities. 
Several Russian quantum physicists 
working abroad are on the RQC’s international 
advisory board. Others, including Alexey 
Ustinov, a condensed-matter physicist at the 
Karlsruhe Institute of Technology in Germany, 
have received grants from the Russian govern- 
ment to set up research groups in Russia. 

And scientists in Russia are already devel- 
oping their own approaches to building large- 
scale quantum computers, says Ustinov. “The 
initiative is a promising start to increase the 
level of quantum research in Russia,” he says. 
“We will see where this will lead.” 


NASA's Mars 2020 mission will feature a detachable helicopter drone, and is just one of several missions to the red planet in the coming year. 


THE SCIENCE EVENTS 
TO WATCH FOR IN 2020 


A Mars invasion, a climate meeting and human- 
animal hybrids are set to shape the research agenda. 


Compiled by Davide Castelvecchi 


Mars attack 


2020 will see a veritable Mars invasion as 
several spacecraft, including three landers, 
head to the red planet. NASA will launch its 
Mars 2020 rover, which will stash rock sam- 
ples that will be returned to Earth in a future 
mission and will also feature a small, detach- 
able helicopter drone. China will send its first 
lander to Mars, Huoxing-1, which will deploy a 
small rover. A Russian spacecraft will deliver a 
European Space Agency (ESA) rover to the red 
planet — if issues with the landing parachute 
can be resolved. And the United Arab Emirates 
will send an orbiter, in the first Mars mission 
by an Arab country. 

Closer to home, China is planning to send 
the Chang’e-5 sample-return mission to the 
Moon. And elsewhere in the Solar System, 
Japan’s Hayabusa2 mission is due to return 
samples of the asteroid Ryugu to Earth, and 
NASA’s OSIRIS-REx will bite off a chunk of its 
own asteroid, Bennu. 


Big sky, big data 

Following the media splash made by its image 
of the supermassive black hole at the centre of the 
galaxy Messier 87 in 2019, the Event Horizon 
Telescope collaboration expects to release 
new results, this time about the black hole 
at the Milky Way’s centre. This could include 
multiple images and perhaps even a movie of 
gas swirling around the behemoth, which is 
called Sagittarius A*. 

Later in the year, ESA’s Gaia mission will 
update its 3D map of the Milky Way, which has 
markedly changed how scientists understand 
the Galaxy’s structure and evolution. And grav- 
itational-wave astronomers will unveil the 
troves of cosmic collisions they observed in 
2019 that created ripples in space-time. These 
include many mergers of black holes but also 
previously unseen collisions of a black hole 
and a star. 


Mega-collider dreams 


CERN hopes to secure funding for a future 
mega-collider this year. The European parti- 
cle-physics laboratory near Geneva, Switzer- 
land, will hold a special meeting of its council 
in Budapest in May, where a committee will 
decide on the plans as part of an update to the 
lab’s European Strategy for Particle Physics. 
CERN’s proposal includes a menu of options 
for a future collider. The lab hopes to build a 
100-kilometre machine that could be up to six 
times as powerful as the Large Hadron Collider 
and cost up to €21 billion (US$23.4 billion). 

In the United States, the Fermi National 
Accelerator Laboratory near Chicago, Illinois, 
should unveil long-awaited results from Muon 
g-2, a high-precision measurement of how 
muons — more-massive siblings of electrons 
— behave in a magnetic field. Physicists hope 
that slight anomalies could reveal previously 
unknown elementary particles. 


Climate homework due 


In August, the United Nations Environment 
Programme will release a major report on 
the scientific and technical aspects of 
geoengineering — approaches that could be 
used to fight climate change. These include 
pulling carbon dioxide out of the atmosphere 
and blocking sunlight. Also in 2020, the 
International Seabed Authority is due to issue 
long-awaited regulations that will enable 
mining of the bottom of the sea. Scientists 
worry that not enough is known about how the 
practice could damage marine ecosystems, 
with potentially disastrous impacts on already 
stressed environments. 


Mosquitoes are being infected with bacteria that prevent them from spreading diseases. 

But the big event on climate will come in 
November, when the COP26 climate con- 
ference — a moment of truth for the Paris 
agreement — kicks off in Glasgow, UK. Under 
the 2015 accord, countries must come for- 
ward with updated targets for reducing their 
greenhouse-gas emissions to help limit global 
warming to no more than 2 °C. But most coun- 
tries have been slow to act on their promises. 
And the future of the treaty itself hangs in the 
balance: the United States is expected to for- 
mally drop out that month. 


US election climax 


The White House and the US Congress are up 
for grabs in November, and the outcome could 
have big implications for science, in particu- 
lar the climate. A second term in office would 
allow President Donald Trump to continue 
unravelling his predecessor’s climate poli- 
cies — and all but ensure the United States’ 
formal exit from the Paris agreement a day 
after the election. Democrats could stymie 
those efforts by winning the White House or 
gaining a majority in both houses of Congress. 
All 435 seats in the House of Representatives 
and 35 of the Senate’s 100 seats are being 
contested. 


‘Humice’ are coming 


The dream of growing replacement organs 
for humans in other animals could get closer 
as researchers make strides in the ethically 
fraught technique. Stem-cell scientist Hiromitsu 
Nakauchi at the University of Tokyo 
plans to grow tissue made of human cells in 
mouse and rat embryos. He will then transplant 
those hybrid embryos into surrogate animals, 
a step that wasn’t allowed until a new law 
in Japan came into effect last March. Nakauchi 
and collaborators have also applied to do a 
similar experiment using pig embryos. The 
ultimate goal of such research is to produce 
animals with organs that can, eventually, be 
transplanted into people. But some research- 
ers think it will be safer and more effective to 
grow ‘organoids’ in the lab. 


Pressure to perform 

Physicists hope to achieve their dream of cre- 
ating a material that conducts electricity with 
no resistance at room temperature — although, 
for now, such superconducting materials work 
only at pressures of millions of kilopascals. 
Following the success of compounds known 
as lanthanum ‘superhydrides’, which in 2018 
broke all temperature records for superconductivity, 
researchers hope to synthesize 
yttrium superhydrides that could be superconducting 
at temperatures of up to 53 °C. 


Scientists are brewing up synthetic yeast. 


Mozzie counter-attack 


In the Indonesian city of Yogyakarta, a major 
test of a technique that could halt the spread 
of dengue fever will reach its conclusion. 
Researchers have released mosquitoes 
carrying Wolbachia bacteria — which inhibit 
the replication of mosquito-borne viruses that 
cause dengue, chikungunya and Zika — and let 
the infection spread in the wild population. 
Smaller tests in Indonesia, Vietnam and Brazil 
have shown tantalizing promise. 

Also promising is a malaria vaccine that is 
due to be trialled on Equatorial Guinea’s island 
of Bioko. And in 2020, the World Health Organization 
hopes to eliminate sleeping sickness, 
or African trypanosomiasis, as a public health 
problem. This notorious disease is carried by 
tsetse flies (Glossina spp.). 


Solid energy 


Companies large and small plan to start sell- 
ing solar cells that use perovskites, promising 
materials that could be cheaper and easier to 
produce than the silicon crystals used in conventional 
solar panels. When paired with silicon 
in ‘tandem’ cells, perovskites could yield 
the most efficient solar panels on the market. 

The energy sector could achieve another 
milestone during the Tokyo Olympic Games in 
July, when Toyota is expected to unveil the first 
prototype of a car powered by ‘solid-state’ lithium-ion 
batteries. These replace the liquid that 
separates the electrodes inside the battery 
with a solid material, increasing the amount 
of energy that can be stored. Solid-electrolyte 
batteries last longer, but they tend to charge 
more slowly. 


Synthetic yeast 


An ambitious effort by synthetic biologists 
to rebuild baker’s yeast (Saccharomyces cerevisiae) 
is due to reach completion in 2020. 
Researchers have entirely replaced the genetic 
code of much simpler organisms before — for 
example, the bacterium Mycoplasma mycoides 
— but doing this in yeast cells is much more 
challenging because of their complexity. The 
effort, called Synthetic Yeast 2.0, is a collab- 
oration between 15 laboratories on 4 conti- 
nents. Teams have replaced the DNA in each 
of the 16 chromosomes of S. cerevisiae piece- 
meal with synthetic versions. They have also 
experimented with reorganizing and editing 
the genome — or deleting chunks of it — to 
understand how the organism evolved and 
how it copes with mutations. Researchers hope 
that engineered yeast cells will unleash more- 
efficient and -flexible ways to manufacture a 
host of products, from biofuels to medicines. 


THE SUPER-COOL 
MATERIALS THAT 
SEND HEAT TO SPACE 


Paints, plastics and even wood can be engineered to stay cool 
in direct sunlight — but their role in displacing power-hungry 
air conditioners remains unclear. By XiaoZhi Lim 


When businessman Howard Bisla 
was tasked with saving a local 
shop from financial ruin, one 
of his first concerns was energy 
efficiency. In June 2018, he 
approached his local electric- 
ity provider in Sacramento, 
California, about upgrading 
the lights. The provider had another idea. It 
offered to install an experimental cooling sys- 
tem: panels that could stay colder than their 
surroundings, even under the blazing hot sun, 
without consuming energy. 

The aluminium-backed panels now sit on 
the shop’s roof, their mirrored surfaces coated 
with a thin cooling film and angled to the sky. 
They cool liquid in pipes underneath that run 
into the shop, and, together with new lights, 
have reduced electricity bills by around 15%. 
“Even on a hot day, they’re not hot,” Bisla says. 

The panels emerged from a discovery at 
Stanford University in California. In 2014, 
researchers there announced that they had 
created a material that stayed colder than its 
surroundings in direct sunlight. Two members 
of the team, Shanhui Fan and Aaswath Raman, 
with colleague Eli Goldstein, founded a start-up 
firm, SkyCool Systems, and supplied Bisla’s 
panels. Since then, they and other research- 
ers have made a host of materials, including 
films, spray paints and treated wood, that stay 
cool in the heat. 

These materials all rely on enhancing a 
natural heat-shedding effect known as passive 
radiative cooling. Every person, building and 
object on Earth radiates heat, but the planet’s 
blanket-like atmosphere absorbs most of it and 
radiates it back. Infrared rays between 8 and 13 
micrometres in wavelength, however, are not 
captured by the atmosphere and leave Earth, 
escaping into cold outer space. As far back as 
the 1960s, scientists sought to harness this 
phenomenon for practical use. But passive 
radiative cooling is noticeable only at night: in the 
daytime, sunlight bathes us in much more heat 
energy than we can send into space. 


A thermal image of a panel with a 
‘super-cool’ coating outside 
Columbia University in New York City. 

The new materials reflect a broad spectrum 
of light, in much the same way as mirrors or 
white paint do. In the crucial 8–13-µm part of 
the infrared spectrum, however, they strongly 
absorb and then emit radiation. When the 
materials point at the sky, the infrared rays can 
pass straight through the atmosphere and into 
space. That effectively links the materials to an 
inexhaustible heat sink, into which they can 
keep dumping heat without it coming back. 
As a result, they can radiate away enough heat 
to consistently stay a few degrees cooler than 
surrounding air; research suggests that temperature 
differences could exceed 10 °C in hot, 
dry places. David Sailor, who leads the Urban 
Climate Research Center at Arizona State University 
in Tempe, has termed them super-cool 
materials. 

These materials might not only save on 
electricity bills, say enthusiasts, but also 
reduce a surge in demand for power-hungry 
refrigeration and air conditioning as the world 
warms. “My belief is that in four to five years, 
daytime radiative cooling systems will be the 
number one technology for buildings,” says 
Mattheos Santamouris at the University of New 
South Wales in Sydney, Australia, who himself 
is working to improve such materials. “It is the 
air conditioner of the future.” 

A few researchers have even suggested that 
the materials might be considered as part of a 
geoengineering strategy, to help Earth shed 
heat to counteract rising global temperatures. 
“Rather than try to block the incoming heat 
from the Sun, can we just make Earth emit 
more?” asks Jeremy Munday, a physicist at the 
University of California, Davis. 

But many scientists are cautious about these 
ideas. So far, theoretical estimates of how much 
electrical power can be saved have been based 
on data from small samples tested over short 
times. There are also doubts about the materials’ 
ability to work in a wide variety of climates 
and places. The cooling effect works best in dry 
climates and with clear skies; when it’s cloudy 
or humid, water vapour traps the infrared radi- 
ation. And the super-cool materials might not 
last in all weathers or fit easily to all buildings. 

Another unknown is whether consumers 
will embrace the idea. Even the simple measure 
of replacing worn-out roofs with reflective 
white ones to cool houses has not been widely 
adopted by homeowners, says Sailor. His 
modelling work, however, suggests that use 
of a super-cool paint might double the energy 
savings compared with a white roof. “It’s a bit of 
a game-changer — potentially,” he says. 


Overcoming the Sun 


In 2012, Raman — who was completing his 
PhD with Fan on materials for harvesting solar 
energy — stumbled on old studies about passive 
radiative cooling, an effect he’d not heard 
of. Realizing that no one had worked out how 
to use it under direct sunlight, he examined 
the optical properties a material would need 
to overcome the Sun’s heat. It must reflect the 
solar spectrum in wavelengths from 200 nanometres 
to 2.5 µm even more effectively than 
white paint, which is already up to 94% reflective. 
And it must absorb and emit as close as 
possible to 100% of the wavelengths in the crucial 
8–13-µm range (see ‘Keeping their cool’). 
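
A rough sense of scale for these two requirements can be had from textbook blackbody physics. The sketch below is illustrative only and is not the Stanford team's calculation. It integrates Planck's law over the 8–13-µm window for an ideal emitter and compares the result with the solar heat a surface absorbs at a given reflectivity. The function names, the 300 K surface temperature, the 97% reflectivity and the 1,000 W m⁻² irradiance are assumptions chosen for illustration; only the 94% figure for white paint comes from the text. Heat radiated back by the atmosphere and heat gained from the surrounding air are ignored, so the emission figure is an upper bound rather than a prediction.

```python
import math

# Physical constants in SI units
H = 6.62607015e-34       # Planck constant, J s
C = 2.99792458e8         # speed of light, m/s
K_B = 1.380649e-23       # Boltzmann constant, J/K
SIGMA = 5.670374419e-8   # Stefan-Boltzmann constant, W m^-2 K^-4


def planck_exitance(wavelength_m, temp_k):
    """Blackbody spectral exitance, in W per m^2 per metre of wavelength."""
    x = H * C / (wavelength_m * K_B * temp_k)
    return 2.0 * math.pi * H * C**2 / (wavelength_m**5 * math.expm1(x))


def band_emission(temp_k, lo_m=8e-6, hi_m=13e-6, steps=2000):
    """Trapezoidal integral of blackbody emission between two wavelengths (W/m^2)."""
    dx = (hi_m - lo_m) / steps
    total = 0.5 * (planck_exitance(lo_m, temp_k) + planck_exitance(hi_m, temp_k))
    for i in range(1, steps):
        total += planck_exitance(lo_m + i * dx, temp_k)
    return total * dx


def absorbed_sunlight(irradiance_w_m2, reflectivity):
    """Solar heat absorbed by a surface that reflects the given fraction of sunlight."""
    return irradiance_w_m2 * (1.0 - reflectivity)


ASSUMED_SURFACE_TEMP_K = 300.0   # assumption: a surface at roughly 27 degrees C
ASSUMED_IRRADIANCE = 1000.0      # assumption: bright, clear-day sunlight, W/m^2

window = band_emission(ASSUMED_SURFACE_TEMP_K)
fraction = window / (SIGMA * ASSUMED_SURFACE_TEMP_K**4)

print(f"Ideal emission in the 8-13 um window: {window:.0f} W/m^2")
print(f"Share of total blackbody emission:    {fraction:.0%}")
for reflectivity in (0.94, 0.97):  # 0.97 is a hypothetical improved coating
    heat_in = absorbed_sunlight(ASSUMED_IRRADIANCE, reflectivity)
    print(f"Reflectivity {reflectivity:.0%}: absorbs {heat_in:.0f} W/m^2, "
          f"idealized margin {window - heat_in:.0f} W/m^2")
```

At the assumed irradiance, each percentage point of sunlight that is absorbed rather than reflected adds 10 W m⁻² of heat, which is why reflectivity beyond that of white paint matters so much to the design.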

All this could be done by engineering materials 
at the nanoscale, Raman and Fan thought. 
Creating structures smaller than the wavelengths 
of light that will pass through them 
should enhance the absorption and emission of 
some wavelengths and suppress that of others. 

The group came up with the idea to etch patterns 
into surfaces and published it in 2013. 
Then the team submitted a proposal to the US 
Advanced Research Projects Agency–Energy 
(ARPA-E) for funding to make it. 

“I immediately thought, ‘Wow, I’d really 
like to see somebody actually do this’,” recalls 
Howard Branz, then a programme director at 
ARPA-E in Washington DC, and now a technology 
consultant in Boulder, Colorado. “There’d 
been a lot of night-time radiative-cooling work, 
but to do it under broad, full sunlight is quite 
startling.” 

Branz gave the researchers US$400,000 and 
a year. With so little time, the Stanford team 
decided to simplify the design and try layer- 
ing materials in more familiar ways. To create 
something highly reflective, the researchers 
alternated four thin layers of materials that 
refract light strongly (hafnium dioxide) and 
weakly (silicon dioxide, or glass), a commonly 
used motif in optical engineering that works 
because of how light waves interfere as they 
pass through different layers. They used the 
same principle to amplify infrared emissions, 
depositing three thicker layers of the same 
materials on top. 

When they tested their material outdoors, 
it stayed almost 5 °C cooler than the ambient 
temperature, even under direct sunlight 
of around 850 watts per square metre. (On a 
bright, clear day at sea level, the intensity of 
sunlight directly overhead reaches around 
1,000 W m⁻².) 

After that success, ARPA-E funded other 
proposals for super-cool materials. Among 
these was an idea from Xiaobo Yin and Ronggui 
Yang at the University of Colorado Boulder, 
who wanted to make materials at large scale. 
They chose to work with cheap plastic and glass. 
Glass spheres of the right size — a few micrometres 
across — emit strongly in the 8–13-µm 
range. Embedding these in a 50-µm-thick film 
of transparent polymethylpentene — a plastic 
used in some lab equipment and cookware — 
and backing this with reflective silver was sufficient 
to create a super-cool material. More 
importantly, the researchers could make the 
film with roll-to-roll technology that churns 
out 5 metres per minute. 


KEEPING THEIR COOL 

‘Super-cool’ materials stay colder than their 
surroundings even in direct sunlight, by emitting heat 
that can pass through the atmosphere and into space. 

Transparent atmosphere 

Earth’s blanket-like atmosphere absorbs most infrared 
wavelengths but is transparent to those between 
8 and 13 micrometres. 

Super-cool materials are extremely reflective (even 
more so than white paint), so they are relatively 
unaffected by sunlight. They also absorb wavelengths 
between 8 and 13 µm, then emit them into space. 

It turned out that many materials exhibit 
super-cooling if structured in the right way 
— not just exotic or speciality ones. In 2018, 
researchers at Columbia University in New York 
City and Argonne National Laboratory in Lemont, 
Illinois, reported a super-cool paint, based 
on a sprayable polymer coating. Many polymers 
naturally emit in the infrared 8–13-µm 
range because their chemical bonds, such 
as those between carbon atoms or between 
carbon and fluorine, eject packets of infrared 
light when they stretch and relax, explains team 
member Yuan Yang. The key was to strengthen 
the polymers’ ability to reflect sunlight. 

Yang’s student Jyotirmoy Mandal — who is 
now a postdoctoral researcher in Raman’s lab 
at the University of California, Los Angeles — 
dissolved fluorinated polymer precursors in 
acetone with a small amount of water. This mixture 
can be sprayed onto a surface to create an 
even polymer coating with tiny water droplets 
dispersed through it. The volatile acetone dries 
first, followed by the water droplets, leaving 
behind pores that fill with air. The overall result 
is a white coating with pores inside that reflect 
the sunlight, Yang says. 

Last May, the Colorado team reported 
another material: a cooling wood, created with 
Liangbing Hu and Tian Li at the University of 
Maryland, College Park. Just like polymers, 
wood contains chemical bonds that emit the 
right kind of infrared radiation, says Li. A net 
cooling effect can be achieved by chemically 
removing a rigid component called lignin to 
make the wood reflective and compressing the 
product to align its cellulose fibres and amplify 
infrared emissions. 

Scientists have also made super-cool thin 
films from polydimethylsiloxane (PDMS), a 
silicone material found in products such as 
lubricants, hair conditioners and Silly Putty, 
by spraying it onto a reflective backing. As 
recently as last August, Zongfu Yu at the University 
of Wisconsin–Madison and Qiaoqiang 
Gan at the State University of New York at Buffalo 
found that an aluminium film spray-coated 
with a 100-µm layer of PDMS stayed 11 °C cooler 
than ambient air when placed in a campus car 
park in the middle of the day. 


Staying cool 


Almost all the research teams have patented 
their inventions and are now trying to market 
them. Gan is working with industry partners, 
which he declined to name, to commercialize 
the PDMS-aluminium film. Columbia Univer- 
sity has licensed its super-cool paint to New 
York start-up MetaRE, founded by Mandal 
and Yang’s Columbia collaborator Nanfang 
Yu, for development. MetaRE is also working 
with industry to develop the paint for roofing, 
refrigerated transportation, storage and textile 
applications, says chief executive April Tian. 
The product is “highly competitive” with con- 
ventional paints, she says. 

Other start-ups have highlighted how 
much electricity their products could save. 
Fan and Raman have developed a proprietary 
system for SkyCool Systems’ panels. In 2017, 
they predicted that the system could reduce 
the amount of electricity a building uses for 
cooling by 21% during the summer in hot, dry 
Las Vegas, Nevada. Raman says the panels will 
pay for themselves in three to five years. Yin 
and Ronggui Yang have started a company in 
Boulder called Radi-Cool, to commercialize 
the glass-embedded plastic. Last January, 
they reported that the material could reduce 
electricity consumption for cooling in the summer 
by 32–45% if it were integrated with water 
chillers in commercial buildings in Phoenix, 
Arizona; Miami, Florida; and Houston, Texas. 
Hu, meanwhile, has licensed the super-cool 
wood material to a Maryland-based firm he 
co-founded called InventWood. He predicts 
that it could save 20–35% of cooling energy 
across 16 US cities. 


ELECTRICITY AT NIGHT, 
WATER IN THE DAY 

Super-cool materials have added benefits. 

Materials that dump heat from Earth 
into space could have unexpected 
applications. They could, for instance, 
make it easier to harvest water from the 
atmosphere in the daytime. At night, water 
vapour condenses into dew on surfaces 
that lose heat to the clear night sky, an 
effect harnessed for centuries to capture 
water. Zongfu Yu at the University of 
Wisconsin–Madison and Qiaoqiang Gan at 
the State University of New York at Buffalo 
found that an aluminium film coated in 
polydimethylsiloxane could not only stay 
cool, but also enhance water condensation 
during the day. The pair started a 
company in Buffalo called Sunny Clean 
Water to commercialize the device. 

The temperature difference between a 
super-cool material and its surroundings 
could also be used to generate electricity 
at night — unlike solar panels, which work 
only in the day. Last September, Aaswath 
Raman, Shanhui Fan and Wei Li at Stanford 
University in California managed to 
produce a trickle of electricity — milliwatts 
per square metre — from such a nocturnal 
device. That shows it’s possible to make at 
least enough electricity at night to power 
a small LED. That’s an exciting proof of 
concept, says Howard Branz, a technology 
consultant in Boulder, Colorado. But 
electricity from solar panels can be stored 
in batteries to generate much larger flows 
of electricity, so it’s not yet clear whether 
the idea will be useful. 

But these estimates are based on experi- 
ments and models that are too limited to be 
extrapolated to whole buildings in cities, cautions 
Diana Ürge-Vorsatz, an environmental 
scientist at the Central European University 
in Budapest who specializes in climate-change 
mitigation. Actual energy savings and how 
quickly a super-cool material will pay for itself 
will depend on a building’s structure, location 
and weather conditions, adds Yin. 

Location is the biggest obstacle. “There 
are certain geographical regions where it just 
won’t work because the atmosphere isn’t dry 
enough,” says James Klausner, a mechanical 
engineer at Michigan State University in East 
Lansing who served as an ARPA-E programme 
director after Branz and has funded some 
proposals in the field. But that’s not too off-putting, 
he says, because the regions where the 
effect works well are arid areas such as the 
southwestern United States or the Middle East, 
which have high demands for air conditioning. 


Super-cool panels on the roof of a shop in Sacramento, California. 

Another challenge is that radiative-cooling 
systems might increase heating costs in winter. 
To address this problem, Santamouris is 
trying to introduce a liquid layer on top of the 
super-cool materials that would freeze when 
the temperature drops low enough. Once the 
liquid solidifies, radiation can no longer escape 
to space, so the cooling effect is cut off. And last 
October, Mandal and Yang reported another 
way to stop overcooling. If they fill the pores 
of their polymer coating with isopropanol, the 
coating starts to trap heat rather than shed it. 
This can be reversed by blowing air through 
the pores to dry them out. 

There’s another issue: the materials achieve 
super-cooling only if they can send their radi- 
ation directly to the cold heat sink of outer 
space. In an urban setting, buildings, people 
and other objects can get in the way, absorbing 
the heat and re-emitting it. The best-performing
materials currently remove heat at a rate of
around 100 W m⁻². Gan and Yu hope to double
that by positioning their films perpendicular 
to the roof so that emissions can escape from 
both surfaces. But this will require adding 
materials around the films that can reflect the 
emissions up into the sky. 

Researchers are looking at other ways to 
increase the materials’ cooling ability. Last 
October, Evelyn Wang at the Massachusetts 
Institute of Technology in Cambridge and her 
colleagues reported that covering a radiative-cooling
film with a light, insulating aerogel
kept the structure 13 °C cooler than its surroundings
at noon in the dry Atacama Desert
in Chile, compared with just 1.7 °C without the
aerogel. The aerogel concept could be used
with other super-cool materials, she says. 

Dreams of using the super-cool materials for
geoengineering to mitigate global warming
seem further off, and unlikely from a practical
perspective. Last September, Munday used
“back-of-the-envelope calculations” to suggest
that current rising temperatures could be
balanced by covering 1-2% of Earth’s surface
with existing materials that generate around
100 W m⁻² of cooling power in the daytime.
But because solar panels still don’t reach that
level of cover after decades of development,
it seems impossible that this nascent technology
could do so in time to be useful, says Mark
Lawrence, a climate scientist at the Institute
for Advanced Sustainability Studies in Potsdam,
Germany. As with any geoengineering
proposal, Munday acknowledges the possible
unintended consequences of disturbing precipitation
patterns and local climates, which
Urge-Vorsatz agrees are likely to be a problem.
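
The scale of that coverage figure can be checked with rough arithmetic. The sketch below is not Munday's calculation: it assumes a total surface area for Earth of about 5.1 × 10^14 m² (a standard figure that the article does not give), takes the 100 W m⁻² cooling power quoted above at face value, and uses a hypothetical helper, `cooling_estimate`, introduced only for illustration.

```python
# Rough, illustrative check of the coverage figures quoted in the article.
EARTH_SURFACE_M2 = 5.1e14   # assumed: approximate total surface area of Earth
COOLING_W_PER_M2 = 100.0    # from the article: cooling power of existing materials


def cooling_estimate(fraction_covered: float) -> dict:
    """Return covered area, total cooling power and globally averaged flux."""
    area_m2 = EARTH_SURFACE_M2 * fraction_covered
    power_w = area_m2 * COOLING_W_PER_M2
    return {
        "area_million_km2": area_m2 / 1e12,  # 1 million km^2 = 1e12 m^2
        "power_tw": power_w / 1e12,          # terawatts
        "global_mean_w_per_m2": power_w / EARTH_SURFACE_M2,
    }


for fraction in (0.01, 0.02):
    print(fraction, cooling_estimate(fraction))
```

Under these assumptions, covering 1-2% of the surface means roughly 5-10 million square kilometres of material and a globally averaged cooling of about 1-2 W m⁻².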

Still, passive radiative cooling might have 
many benefits, says Raman (see ‘Electricity at 
night, water in the day’). It could, for instance,
help to stop solar panels losing efficiency as
the temperature rises. And all electricity generation
and conversion processes produce
waste heat, says Yin, even if they use renewable
energy rather than fossil fuels. “This is the only 
technology that harnesses all this wasted heat 
and dumps it back to space,” he says. 


XiaoZhi Lim is a freelance writer in Natick, 
Massachusetts. 
Books & arts


2020 in science & culture


The new uncanny valley, climate on film, 
imperilled elephants and Earth Day at 50: 
what’s coming this year. By Nicola Jones 


Ecovisionaries 
Royal Academy of Arts, London. 
Until 23 February. 


Sustainability demands creative 
thinking. At the Royal Academy, 
architects, artists and designers 
are collectively reimagining 
our relationship with nature 
amid challenges ranging from 
climate change to species 
extinction. New commissioned 
works include The Substitute by 
Alexandra Daisy Ginsberg. This 
life-size digital reproduction 

of the critically endangered 
northern white rhinoceros 
(Ceratotherium simum cottoni) 
was made using film footage 
enhanced by data from 
artificial-intelligence company 
DeepMind. Older works explore 
endangered fish in Africa’s Lake 
Victoria (Tue Greenfort’s 2017 


Tilapia) and mining lithium for 
batteries in the Atacama Desert 
(research studio Unknown 
Fields’ In The Breast Milk of the 
Volcano, 2016-18; pictured). 


Troy: Myth 
and Reality 


British Museum, London. 
Until 8 March. 


The ancient city of Troy has many 
faces. There is the legendary, 
war-torn Troy of Homer’s epic 
Iliad and Odyssey. And there 

are the Troys uncovered by 
archaeology in modern-day 
Turkey: a ‘layer cake’ of sites 
spanning 3000 BC to AD 500. 
This blockbuster exhibition 
covers both, and includes 
artefacts stretching back three 
millennia, from Athenian pottery 
(detail pictured) to Roman 
sculpture. Evidence suggests
that a battle between the Hittite
empire and Greece (which they
called Ahhiyawa) might have 
been the real Trojan War. 


Origins: Fossils 
from the Cradle of 
Humankind 


Perot Museum of Nature and 
Science, Dallas, Texas. 
Until 22 March. 


Hominin fossils rarely leave their 
countries of origin. Now, two 
South African skeletons from 
recently discovered branches 
of the hominin family tree are 
on display in the United States: 
Karabo (Australopithecus 
sediba) and Neo (Homo naledi). 
Both are from digs led by 
palaeoanthropologist Lee 
Berger. Researchers will attempt 
to 3D-print missing parts of 
Karabo’s skeleton, using scans of 
rocks from the discovery site. 
People keen to learn more 
about hominin history can 
look to the Iziko South African 
Museum in Cape Town, where 
the Origins of Early Sapiens 
Behaviour Exhibition has been 
expanded. 


Sahel: Art and 
Empires on the 
Shores of the Sahara 


The Metropolitan Museum of Art, 
New York City. 
30 January - 10 May. 


Africa’s Sahel (Arabic for ‘shore’) 
is a vast semi-arid band spanning 
Senegal, Mali, Mauritania

and Niger. This culturally 

rich region, now plagued by 

war and desertification, has a 
Earth
Day 50th 
Anniversary 


22 April 


For the latest reviews 

of books and art events 
published by Nature, visit: 
go.nature.com/2se869n 


In 1970, some 20 million people across the United 
States joined the first Earth Day to protest against 
the constellation of problems plaguing the planet, 
from toxic dumping to extinctions. This year, the 
Earth Day Network is launching a series of events to 
kick anniversary protests into high gear, including 
a citizen-science mobile app and a registry of
Earth-inspired art, theatre, dance and film. See 
go.nature.com/36xxile for more. 


fascinating history. A succession 
of influential kingdoms held 
sway here, from the empire

of Ghana (AD 300-1200) to 

that of Segu (1640-1861). This 
show is the culmination of a 
four-year partnership with 
academics from Yale University 
in New Haven, Connecticut, 
Dakar’s Fundamental Institute 
of Black Africa and elsewhere. 
Featured are manuscripts, 
textiles and sculptures ranging 
from a 3-tonne eighth-century
Senegalese megalith in the form
of a lyre to a 7-centimetre statue
of a female torso more than 
4,000 years old. 

As the Met celebrates its 
150th birthday in 2020, look 
out for other shows, from 
Making Marvels: Science 
and Splendor at the Courts 
of Europe (until 1 March) to 
Cubism and the Trompe l’Oeil
Tradition (24 November 2020 to 
28 February 2021). 


Countryside: Future 
of the World 


Guggenheim Museum, New York City. 
20 February - Summer. 


Cities house more than half of 
humanity, but cover less than 
3% of Earth’s non-icy lands. 
Here, architect and urbanist 
Rem Koolhaas turns to the rural. 
The show will examine artificial 
intelligence and automation, the 
effects of genetic experiments, 
political radicalization, 
migration, large-scale territorial 
management, human-animal 
ecosystems and the impact of 
the digital. 


Neri Oxman: 
Material Ecology 


Museum of Modern Art, 
New York City. 
22 February - 25 May. 


Can useful structures be 
grown rather than built? So 
asks Neri Oxman, a medically 
trained architect who directs 
the Mediated Matter group at 
the Massachusetts Institute of 


Technology (MIT) Media Lab in 
Cambridge — a facility renowned 
for fomenting academic 
creativity at the intersection 

of art and science. This show 
will feature 8 major projects 
from Oxman’s 20-year career in 
‘material ecology’: the realm of 
biologically inspired or created 
products. Silk Pavilion (2013) 
used thousands of silkworms to 
spin a dome in an MIT building;
Ocean Pavilion (2014) built 
structures out of chitosan, a 
polymer found in crustacean 
shells. 


Uncanny Valley: 
Being Human in
the Age of AI


de Young Museum, 
San Francisco, California. 
22 February - 25 October. 


In 1970, Japanese engineer 
Masahiro Mori noted that 
lifelike androids occupy an 
‘uncanny valley’: a realm
between non-human and fully 
human that evokes discomfort, 
even revulsion. This exhibition, 
a few dozen kilometres from 
Silicon Valley, explores modern 
denizens of this uncanny realm. 
A statue in the museum garden 
has an active beehive for a head 
(an allusion to the complex 
workings of a neural network; 
pictured); termite mounds 
symbolize the minions of the 
crowdsourced marketplace; 
abandoned patents are 
3D-printed to bring them to 
life. The biases and pitfalls of 
artificial-intelligence algorithms 
are thrown into stark relief by a
host of art and film projects. 


CLIMATE CHANGE 
ON SHOW


Stage and screen events 
highlighting a climate crisis. 


Climate Speaks 2020 

Climate Museum, New York City. 
January - June.

The first US museum dedicated 
to the climate emergency is 
hosting its second spoken-word 
programme for teenagers. 
Successful applicants will 
spend six months exploring links 
between climate change, social 
justice and the arts, leading to a
May performance (pictured, one 
of last year’s participants). 


Last Catastrophist 

Boston Center for the Arts, 
Massachusetts. 

24 January - 8 February. 

In this dystopian science- 
fiction play by David Valdes, 
climatologists Marina and Lucia 
are holed up in Iceland, hiding 
from the threats of anti-climate- 
science cabal Eternal Sunshine. 


The Flight of the Hummingbird 
Touring schools in British 
Columbia, Canada. 

January - May. 

Based on a graphic novel by 
Indigenous Haida artist Michael 
Nicoll Yahgulanaas, this opera
brings issues of climate change, 
social justice and personal 
responsibility to an audience 
aged 5 to 15. 


Billie Eilish Eco-Village 

March - July. 

Alternative singer-songwriter 
Billie Eilish has partnered with 
sustainability-focused non-profit 
organization Reverb to ‘green’ 
her 2020 world tour. Venues 
from Miami to London will host 
‘eco-villages’ where fans can 
learn about climate change. 


Greta vs Climate 

Director Nathan Grossman is 
planning a documentary on 
pioneering teenage activist 
Greta Thunberg — who founded 
the Friday student climate 
strikes that have swept the 
globe. 
Serpentine Galleries
50th Anniversary 


The Serpentine Gallery, London. 
4 March - 17 May. 


The Serpentine Galleries will 
celebrate its 50th anniversary
with events on issues 

from sustainability to new 
technologies, guided by curator 
of ecology Lucia Pietroiusti. 
Two shows will spearhead 

the year. Cao Fei: Blueprints 
will feature virtual reality and 
installations alongside the 
artist’s film Nova (2019), all 
examining the urbanization of 
Beijing’s Jiuxianqiao district. 
And Amsterdam-based Studio 
Formafantasma presents 
Cambio, a project on the
ecological legacy of forestry 
and wood products. A summer 
exhibition will kick off the 
multiyear programme Back to 
Earth, with works to spur action 
against the climate crisis (see 
also ‘Climate change on show’). 


Janet Echelman’s 1.8 
Renwick Gallery of the Smithsonian 
American Art Museum, 
Washington DC. 


3 April 2020 - 14 August 2022. 


The initial inspiration for 

Janet Echelman’s ephemeral 
installations was fishing

nets on Indian beaches and 
Lithuanian lace. She scales them 
up with high-tech materials to 
create building-sized works 
that reflect the global impacts 
of seismicity. Her 2010 work 
1.26 — a fibre sculpture sparked 
by computer simulations of that 
year’s earthquake and tsunami 
in Chile, which shortened the 
day by 1.26 microseconds — has 
been hung between buildings, 
from Colorado to Singapore. 1.8, 
another installation of knotted 
fibres, was sparked by the 
catastrophic 2011 earthquake 
and tsunami in Tohoku, Japan, 
which cut off 1.8 microseconds. 


Hiroshima: 75th
Anniversary


6 August.


The Summer Olympic Games 

in Tokyo will be threaded with 
memories of the Second World 
War nuclear attack on Hiroshima 
75 years earlier (pictured, the

9 August Nagasaki attack). The 
Hiroshima Peace Memorial, for 
example, will be on the Olympic 
opening ceremony’s torch- 

relay route. Memorial events 
will also be held around the 
world. The Japanese American 
National Museum in Los Angeles, 
California, in partnership with 
Hiroshima and Nagasaki, will 
display artefacts belonging to 
victims of the attacks (Under a 
Mushroom Cloud: Hiroshima, 
Nagasaki, and the Atomic Bomb, 
will show until 7 June). And in
Hiroshima itself, the annual 

6 August Peace Memorial 
Ceremony will mark the moment 
with silence, and a procession of 
thousands of lanterns floating 
down the Motoyasu River. 


Elephants and 
Us: Considering 
Extinction 


National Museum of American 
History, Washington DC. 
Until 13 September. 


From the nineteenth to the 
mid-twentieth centuries, ivory 
was a luxury commodity used

to produce piano keys, billiard 
balls, buttons and hair combs. 
African elephant populations 
plummeted from more than 

ten million to fewer than one 
million. This exhibition tracks US 
work to stem the trade, starting 
with the enactment of the 
African Elephant Conservation 
Act in 1989. Yet a poaching surge
that began in 2006 threatens 
the bush and forest elephants of 
Africa: a 2015 count was the first 
in 25 years to report a decline in
elephant numbers. 


Turner and the 
Modern World 


Tate Britain, London. 
28 October 2020 - 7 March 2021. 


Artist J. M. W. Turner was a 
technophile, famously capturing 
the Industrial Revolution in 
paint. Tate Britain celebrates his 
fascination with machine power 
in this major exhibition. The 
show — featuring works from 

the 1790s to the steam boats and 
railways of the 1840s — coincides 
with the showing of finalists for 
the year’s Turner Prize. 


SCIENCE FICTION 
ON FILM


Apocalypses and AIs.


BIOS 

US opening 2 October. 

Tom Hanks is back. Playing 

a sickly inventor and the 

last human left on a post- 
apocalyptic Earth, he creates 
a robot to keep him and his 
dog company, and to protect 
the dog after he dies. Director 
Miguel Sapochnik boasts an 
epic CV, including the Game of 
Thrones episode ‘Battle of the 
Bastards’. 


Dune 

US opening 18 December. 
David Lynch’s 1984 adaptation 
of Frank Herbert’s sprawling 
1965 cult-classic novel 
(featuring giant sand worms 
and battles over a mind-altering 
drug called the spice) was a 
box-office failure. Fans are 
hoping that Denis Villeneuve’s 
version — the first of a planned 
two-parter — will prove more 
satisfying. 


The Division 

Rumoured Netflix release. 

On Black Friday, bioterrorists in 
New York City seed banknotes 
with a modified strain of 
smallpox called the Green 
Poison. Jake Gyllenhaal and 
Jessica Chastain star in this 
video-game adaptation directed 
by David Leitch. 


Robocalypse 

Rumoured cinematic release. 

In the 2011 novel of the same 
name, artificial intelligence 
Archos R-14 sets out to preserve 
life on Earth — by wiping out 
human civilization. Produced by 
Steven Spielberg, this long- 
delayed project might finally 
come to fruition in 2020 under 
the direction of Michael Bay. 


The Invisible Man 

US opening 28 February. 
Loosely based on the 1897 novel 
of the same name by H. G. Wells, 
this psychological thriller — 
directed by Leigh Whannell and 
starring Elisabeth Moss (of The 
Handmaid's Tale) — follows the 
tribulations of a woman stalked 
by an invisible ex-boyfriend. 


Elisabeth Moss plays Cecilia Kass in The Invisible Man. 
One of five water-reuse plants in Singapore, which together supply about 40% of the nation’s water for drinking and other uses.


Drink more recycled wastewater 


Cecilia Tortajada and Pierre van Rensburg 


There is no room for 
squeamishness in the face 
of the world’s growing 
water shortage — three 
steps could vastly improve 
the image of reused water 
for drinking. 
Drinkable water is becoming increasingly
scarce. Population growth,
pollution and climate change mean
that more cities are being forced to
search for unconventional water
sources. In a growing number of places,
drinking highly treated municipal wastewater,
called ‘reused water’, has become the
best option and, in some cases, the only one
(see ‘What is reused water?’).

But anxieties about reused water, often 
heightened by sensational media coverage, 
have prevented several projects from going 
ahead. Some people are concerned that 
reused water will contain more pathogens
and chemicals than drinking water sourced 
from lakes or rivers. Others are simply dis- 
gusted by the idea of consuming water that 
has passed through toilets and drains before 
being treated. 

Around two billion people now live in 
countries with ‘high water scarcity’ — mainly 
in northern Africa and western, central and 
southern Asia. With the global population
predicted to increase from 7.7 billion today 
to 10 billion in 2050 — an estimated 70% of 
whom will live in urban areas — the demand 
for safe drinking water is set to rise drastically. 


ROSLAN RAHMAN/AFP/GETTY 


WHAT IS

REUSED WATER? 
‘Reused’ water comes from highly 
treated wastewater. 


In middle- and high-income countries, 
domestic (municipal) wastewater — from 
houses, shops and businesses, but not from 
industries — is generally collected, treated 
in sewage plants and discharged into rivers, 
lakes and other natural water bodies. The 
‘raw water’ is then collected, treated again 
and used by towns and cities downstream 
for drinking, agriculture, landscape irrigation 
or industrial processes. 

An alternative strategy is to treat municipal 
wastewater more rigorously so that it can be 
used for drinking. After it goes through the 
sewage plant, it is treated in a second plant 
(and sometimes a third) using advanced 


According to a 2019 United Nations assess- 
ment, water demand in general is likely to 
increase by 20% to 30% between now and 
2050 (ref. 3). 

Thus, conserving water is paramount. 
Supply infrastructure needs to be improved 
and better managed, including through the 
use of smart sensors and other technologies. 
Economic instruments, such as appropriate 
pricing, can boost efficient usage. Legislation 
needs to be implemented to lessen pollution. 
And all sectors, public and private, need to
be educated about the importance of saving 
water, as does society more broadly. 

High on the list should be efforts to 
investigate the benefits and risks of drinking 
reused water, including ways to make it more 
acceptable to consumers. 


Public perception 


Opposition from citizens has stalled several 
projects aimed at providing people with 
reused drinking water over the past two 
decades. 

In 2000, the Los Angeles Daily News ran 
an article titled ‘Tapping toilet water’ about 
the East Valley Water Recycling project that 
had begun in the San Fernando Valley in Los 
Angeles, California, in 1995. People in the 
region were worried that the water, which they 
thought would be supplied only to those liv- 
ing in low-income areas, would be unsafe. The 
project was politicized by mayoral candidates, 
and the Los Angeles Department of Water 
and Power, which had proposed the project, 


chemical, biological and physical treatments. 
The water is then fed directly into the 
drinking supply system or into the natural 
system (rivers, lakes, aquifers or reservoirs). 
In the latter scenario, water is subsequently 
extracted from the natural system, treated 
again and then supplied to people for 
drinking or other uses. In both cases, the 
resulting water is termed ‘reused’.

In many places, discharges of wastewater 
treated in the usual way (so just once) are 
making up an increasing proportion of 
river flow. Yet authorities still consider
such rivers ‘natural sources of freshwater’. 
Because downstream treatment methods 
might not be adapted to the actual quality 
of the raw water, this is increasingly posing 
a health risk. Thus, treating wastewater 
to higher standards in a controlled 
environment and reusing it for specific 
purposes can make more sense, for both 
economic and health reasons. C.T. & P.v.R. 


eventually decided not to implement it. Since
then, the reused water has been used only in
irrigation and industry.

In Queensland in Australia, residents 
successfully opposed a reuse project in 
Toowoomba in 2006 and the Western Corri- 
dor Recycled Water Scheme in southeastern 
Queensland in 2009, even while the country 
was experiencing the most severe drought 
since records began. 


“Reused drinking water is 
actually subject to stricter 
regulations, monitoring, 
assessments and auditing.” 


In Toowoomba, 62% of around 95,000 peo- 
ple voted against the project, largely because 
of safety concerns and fears that it would harm 
industries including tourism, food processing
and property sales. The Western Corridor
Recycled Water Scheme cost Aus$2.4 billion 
(US$1.6 billion) to construct and aimed to 
produce up to 230,000 cubic metres of water 
per day to cover around 30% of southeastern 
Queensland’s water supply needs. But in 2009, 
following political pressure and a break in the
drought, it was decided that the scheme would
produce drinkable reused water only when
the levels of the reservoir (where the reused
water would be stored) fell below 40% of full
capacity.

Public scepticism over water safety is not 
completely unwarranted. In the United States
and Canada, for example, there are still com- 
munities that lack access to safe drinking water 
— predominantly among low-income and 
minority ethnic populations’. In several cases, 
drinking water has been shown to be unsafe 
for the population, such as in Flint, Michi- 
gan, in 2014, and in several cities in Canada 
this year. In all of these instances, the water 
was found to contain higher concentrations 
of lead than those deemed safe by the regula- 
tory authorities. In October last year, testing 
revealed that nearly 300 drinking-water wells 
and other water sources in California contain 
traces of chemicals known as PFASs (per- and 
poly-fluoroalkyl substances) that have been 
linked to certain cancers and other health 
problems. 

Currently, however, reused drinking water 
is actually subject to stricter regulations, 
monitoring, assessments and auditing than 
standard drinking water. 


Image enhancement 


Three steps would improve the image of 
reused water. 

Do more research. Wastewater contains 
hundreds of known chemical and pathogenic 
contaminants that, if not treated properly, 
can cause serious acute and chronic diseases, 
such as cholera or typhoid. Also, new chemicals
are continually being introduced to
the market, and new strains of bacteria and 
viruses discovered. Investigators at univer- 
sities and those working for water-utility 
companies must study, quantify and effec- 
tively mitigate any emerging risks, and must 
keep appraising the overall benefits and 
costs of reused water on both human and 
environmental health. 

Especially as technologies for detection 
become more sensitive, more affordable and 
widely available, the presence of pathogens 
and chemicals must be continuously monitored
(by daily or even more frequent testing)
to protect the public from problems that
might emerge. Chronic risks from long-term
exposure to low levels of toxic chemical sub- 
stances are just as important to track as acute 
risks resulting from a one-off exposure.

In middle- and high-income countries, 
drinking water, whether or not it is reused 
water, must meet national, regional and local 
health standards (or whichever apply) for 
pathogens, chemicals and any other types of 
contaminant. So far, water agencies in cities
using reused water have been able to meet 
these standards — through the use of multiple 
barrier-treatment steps from chemical to 
Comment


THREE 
SUCCESSES 


People in Windhoek in Namibia, Orange 
County in California and Singapore had to 
start drinking reused water. 


Located in an arid to semi-arid environment 
with little access to surface water sources, 
Windhoek was the first city to create a 
drinking-water supply from reused water in 
1968. The Goreangab Water Reclamation 
Plant currently produces around 24% of 
Windhoek’s drinking water (21,000 cubic 
metres per day). During the 2014-16
drought, supplies from nearby reservoirs 
could meet only 10% of demand, instead 
of the expected 75%. Reused water from 
Goreangab then contributed up to 30% of 
the city’s total water supply. 

Operational since 2008, the Orange 
County Groundwater Replenishment System 
has become the largest reuse facility in the 
world. It produces 379,000 m³ of drinkable


microbial, through real-time monitoring of 
microbes and chemicals, and by using various 
risk-management strategies throughout treatment
and distribution.

Improve public outreach. Water-utility 
companies must develop more comprehen- 
sive strategies on information dissemination, 
public consultation, education and 
engagement. 

Community engagement is not, and must
never be perceived to be, a means to convince
the public that certain projects should go 
ahead. Rather, it should be about setting up 
platforms, so that people’s concerns can be 
heard and addressed early on, even if that 
means modifying plans. 

Some successful projects can offer a 
model. In the 1990s, for instance, the city of 
San Diego in California planned a water reuse 
project to reduce its dependence on water 
transfers from the Colorado River and other 
sources. The project was initially supported 
by the public. But that support fell away for 
various reasons, including inconsistencies in 
the information provided by expert panels on 
the safety of recycled wastewater. Following 
the use of terms in the media such as ‘toilet to
tap’ and ‘sewerage beverage’, and claims that
the reused water would be supplied only to
low-income communities, resistance was such
that the city council converted the project to a
non-drinkable scheme in 1999 (ref. 13). 

Yet San Diego still needed more drink- 
ing water. So in 2004, the company Public 
Utilities decided to develop more-compre- 
hensive strategies for public outreach and 
education. Among the suite of approaches 
deployed were an online and telephone 
water per day. The project is widely accepted
in part because the utility company, the 
Orange County Water District, prioritized 
public information and engagement from 
the start. 

Singapore spent decades planning a 
water reuse scheme now called NEWater. 
By the time the project was launched in 
2003, comprehensive communication and 
education efforts on long-term safety and 
reliability issues, involving the government 
and other decision-makers, had already 
been established. Today, NEWater supplies 
about 40% of Singapore’s drinkable and 
non-drinkable water. If all goes to plan, by 
2060, it will meet 55% of the city’s water 
demands. Most people in Singapore are
aware that their island city state is short 
of water, being too small to store the 
rainfall it receives. And they appreciate the 
importance of NEWater. On each Singapore 
National Day (9 August), thousands of 
people who attend the celebrations are 
given bottles of NEWater, which they drink 
without qualms. C.T. & P.v.R. 


survey, research involving focus groups, 
opportunities for city staff to discuss the 
project with San Diego voluntary service 
organizations and others, and a dedicated 
website providing information. 

These efforts paid off. In 2004, only 26% 
of those surveyed approved of water reuse. 
By 2012, 73% did. The city approved the ‘Pure 
Water San Diego’ project in 2013 (ref. 14). It 
is expected to produce some 114,000 m³ of
drinking water per day by 2023 and to supply 


“Water-utility companies 
should start implementing 
reuse projects in places
where the need is greatest.”


one-third of the city’s water needs by 2035. 

Implement projects where need is great. 
Competent water-utility companies should 
start implementing reuse projects in places 
where the need is greatest. They will need to 
have sufficient knowledge, technical know- 
how, staffing levels and financial capacity, and 
be operating in cities where there are strict 
water-quality regulations. Once such schemes 
have been proved safe and effective in places 
where the stakes are high, others will be more 
likely to support similar projects in their own 
communities. 


Keys to success 

The key to these strategies working is the 
continuous involvement of all stakeholders 
— from city mayors to national governments, 
from businesses and local health and medical 
boards to community and environmental
groups, religious leaders and the media. 

At least three important economic centres 
— Singapore, Windhoek in Namibia and 
Orange County in California — would not have 
progressed to where they are today without 
reused drinking water (see ‘Three successes’). 
In fact, without these reuse projects, the strict 
water rations that were likely to result could 
have had severe impacts on socio-economic 
development. Moreover, reused water can 
benefit streams, rivers, lakes, wetlands and 
aquifers, in part because the excess water 
from such projects that is returned to natural 
systems is of better quality than standard 
treated wastewater.
Readers respond


Correspondence 


Equality drives can 
silence women 


Gender-equality initiatives in 
academia can have unintended 
drawbacks (see C. Tzanakou 
Nature 570, 277; 2019). Counter- 
intuitively, they can result in 
the serious under-reporting of 
sexual harassment in academia, 
according to the 2019 European 
Gender Summit at which I
chaired a session (see gender- 
summit.com). 

Universities recruiting women 
academics through gender- 
equality initiatives search for 
top-tier talent. Those that 
receive extra funding for such 
initiatives do not necessarily 
look kindly on staff who speak 
out about harassment or 
unequal treatment. There are 
reports of leaders exposing 
whistle-blowers to retaliation 
tactics such as intimidation, 
exclusion and silencing 
(D. Fernando and A. Prasad Hum. 
Relat. 72, 1565-1594; 2019). 

The research output of 
whistle-blowers can plunge 
under such harrowing 
circumstances. They lose trust 
in the institutions they worked 
so hard to become a part of. 
Moreover, witnesses to such 
retaliatory practices become 
reluctant to report harassment. 

Universities must embrace 
complaints if they are to achieve 
diversity and inclusivity. 
Otherwise, recruiting top 
women academics through 
gender-equality initiatives 
could become an unintentional 
search-and-destroy mission. 


Susanne Tauber Groningen, 
the Netherlands. 
s.tauber@rug.nl 


China’s shades 
of greening 


Your view that China’s 
re-vegetation of its deserts 
could exacerbate water 
shortages risks oversimplifying 
an incredibly complex eco- 
restoration problem (Nature 
573, 474-475; 2019). 

Far from just planting trees in 
arid areas, China’s re-vegetation 
codes vary for different regions 
and greening programmes. 
The nationwide Grain-to-Green 
programme, for example, 
aims to restore unstable and 
low-productivity farmlands to 
forest or natural vegetation. In 
humid areas, research optimizes 
greening programmes for 
plant selection and socio- 
economic benefits. And China’s 
re-vegetation projects are 
confined to a range that local 
water resources can sustain. 

Re-vegetation, like any eco-restoration 
strategy, is not a 
catch-all solution to carbon 
sequestration, soil erosion 
and flooding. But, rather than 
worrying mainly about water 
consumption, Chinese and 
other scientists are investigating 
the nexus of vegetation, soil, 
water, ecosystems and human 
society. 
Earthquakes: heed 
shocks and patterns 


Being able to distinguish 
foreshocks and aftershocks 
of earthquakes in real time 
could be useful for earthquake 
prediction (see L. Gulia and 
S. Wiemer Nature 574, 193-199; 
2019). For example, the authors 
claim that, in retrospect, 
their method could have 
predicted the biggest such 
event in the 2016-17 cluster of 
earthquakes that occurred in the 
Apennines in central Italy: the 
magnitude-6.6 earthquake that 
hit the town of Norcia in October 
2016. There were no casualties, 
yet the death toll from a 
similar event in the region, 
the Avezzano earthquake of 
magnitude 6.7 in January 1915, 
was 30,000. 

How could this difference be 
explained? It could be because 
Italy’s Major Risk Committee, 
of which we were members 
at the time, found that a large 
event had a higher probability 
of occurring than usual, based 
on the persistence of the 
earthquake sequence in the 
region, and recommended 
putting the entire area under 
official alert. The committee 
issued a warning 40 hours 
ahead of the earthquake to the 
public, the press and the Civil 
Protection organization (see 
go.nature.com/2ecmvwk). As 
a result, prefects and mayors 
enforced mass evacuation. 
Testosterone’s role 
in ovulation 


As authors of the book reviewed 
by Randi Epstein (Testosterone: 
An Unauthorized Biography, 
Nature 574, 474-476; 2019), we 
wish to clarify two issues. 

The first concerns Epstein’s 
assertion that testosterone 
and its precursor, DHEA, have 
a role in the maturation of 
ovarian cells. She suggests that 
“DHEA might boost fertility 
directly or as a mediator of 
oestrogen production”. But 
our reading of the evidence 
indicates that DHEA’s positive 
effect on fertility is not 
because it mediates oestrogen 
production but because it is 
converted to testosterone. In 
our book, we describe studies 
in animal models showing 
that blocking the conversion 
of DHEA to oestrogen doesn’t 
reduce DHEA’s effects, whereas 
knocking out androgen 
receptors creates major fertility 
problems in females, including 
premature ovarian failure. 

The second issue concerns 
Epstein’s implication that our 
case for testosterone’s role in 
ovulation rests on interviews 
with a single clinician. In fact, 
our conclusions are based on 
more than a dozen studies in 
non-human animals, and on 
a comprehensive analysis of 
original research and review 
articles on the use of DHEA or 
other androgens to boost the 
response to fertility treatment 
in women. The interview with 
the clinician simply served as a 
‘hook’ for the story. 


Rebecca Jordan-Young Barnard 
College, New York, New York, 
Tuberculosis vaccine finds 
an improved route 


Samuel M. Behar & Chris Sassetti 


A widely used vaccine against tuberculosis has now been 
shown to provide almost complete protection when injected 
intravenously. This is a striking improvement over vaccination 
through the typical intradermal route. See p.95 


Tuberculosis is the deadliest human infection, 
killing 1.5 million people in 2018 alone 
(go.nature.com/2kbuiq). It is widely accepted 
that an effective vaccine against the bacterium 
responsible, Mycobacterium tuberculosis, 
would be the most practical way to control the 
disease. However, the pathogen is often able to 
resist the immune responses elicited by vacci- 
nation. This has raised the question of whether 
it is possible for a conventional vaccine to 
confer sterilizing immunity against TB — a 
gold-standard immune status for vaccines, 
under which disease is prevented and the path- 
ogen completely eliminated, often before it 
can even establish a productive infection. 
On page 95, Darrah et al.¹ provide a resounding 
answer to this question by showing that 
near-complete protection from TB infection 
can be conferred using a century-old vaccine, 
simply by changing its route of administration. 

The only currently licensed vaccine 
against TB is a live strain of the related path- 
ogen Mycobacterium bovis, the virulence 
of which was attenuated in the laboratory 
between 1908 and 1921. The strain, known 
as bacille Calmette–Guérin (BCG), has been 
administered to more than one billion people 
(go.nature.com/2cxwew6) since then (Fig. 1). 

The BCG vaccine is effective against some 
deadly early-childhood forms of TB. However, 
its ability to prevent the transmissible pul- 
monary form, which is the dominant form in 
adults, has been patchy”: it confers protection 
for some groups of people in some countries, 
but is generally insufficient to reduce the 
number of active TB cases in countries where 
the infection is endemic. Despite these limi- 
tations, BCG remains the only TB vaccine to 
confer protection in large-scale trials’. The 
mechanisms that determine its efficacy are a 
topic of much interest. 


BCG is typically given as an injection into 
the dermal tissue that lies just beneath the 
outer layer of the skin. This injection site is 
convenient and contains specialized cells 
that stimulate immune responses. However, 
vaccines that activate immune cells at the 
site of potential infection can be more effec- 
tive at destroying invading pathogens. Thus, 
current immunological thinking suggests 
that vaccines administered directly into the 
lung or the upper airways would be better at 
preventing pulmonary infections, including 
influenza and TB. Darrah and colleagues there- 
fore investigated whether a different route of 
BCG administration could improve protection 
against pulmonary TB. 

Darrah et al. performed their analysis using 
rhesus macaques, because TB infection in 
these monkeys closely mirrors the human 
disease. They evaluated five vaccination strat- 
egies. Animals were given the BCG vaccine in 
one of the following ways: at the standard 
dose through the conventional intradermal 
(i.d.) route; at a higher-than-normal dose 
intradermally; by means of an aerosol to 
inoculate the lung; through a combination of 
the high dose i.d. and inoculation by aerosol; 
or through an intravenous (i.v.) injection. The 
authors exposed the macaques to M. tubercu- 
losis six months after vaccination, and tracked 
disease progression to determine how the 
administration route and dose of the vaccine 
affected protection against the infection. 

Vaccinations given intradermally or by 
aerosol conferred, at best, modest protection 
from pulmonary TB. By contrast, 
i.v. vaccination afforded nearly complete 
protection from the disease. Strikingly, the 
researchers could not detect any trace of the 
pathogen in six out of ten animals that received 
the i.v. vaccination, indicating that the infection 
had been either prevented or cleared. Three of 
the other monkeys also showed high levels of 
protection. Thus, the route of BCG inoculation 
clearly affects immunity, and the i.v. route confers 
by far the strongest protection against TB. 

Figure 1 | Ampoules of the BCG vaccine against tuberculosis. This vaccine has been used for almost a 
century, typically given as an injection just under the skin. Darrah et al.¹ now provide evidence in monkeys 
that the vaccine’s efficacy can be greatly improved using intravenous injection. 

What makes i.v. BCG vaccination so 
effective? Clear immunological correlates 
of protection (characteristics indicative of 
immunity against a disease) proved difficult to 
identify in the current study, because only one 
of the ten animals that received i.v. BCG was 
not protected against the infection, making 
it hard to properly compare protected and 
unprotected animals. To gain an understand- 
ing of the potential underlying mechanism, 
Darrah and colleagues therefore compared 
the immune responses of animals vaccinated 
by the different routes. 

Compared with i.d. and aerosol vaccination, 
i.v. BCG led to a massive influx of immune cells 
called T cells into the lungs. The increased 
number of T cells was still apparent six months 
later, when the animals were exposed to 
M. tuberculosis. It is likely that this expan- 
sion occurs because i.v. injection leads to the 
delivery of a high dose of BCG to the lung — a 
hypothesis consistent with a recent study* 
showing that direct intrabronchial inoculation 
of BCG can also protect against M. tuberculosis. 

The authors next showed that the T cells 
recognized protein fragments called antigens 
produced by BCG. Because BCG and M. tuber- 
culosis are closely related bacteria, these T cells 
also recognize M. tuberculosis antigens. The 
T cells that were recruited to the lung were classified 
as differentiated ‘memory’ T cells on the 
basis of their gene-expression profiles, the proteins 
on their surfaces and their function. These 
T cells survive long after vaccination, and, 
because they recognize the antigens produced 
by M. tuberculosis, they can be rapidly activated 
on infection, producing many ‘effector’ T cells, 
which combat the invading pathogen. 

Although this circumstantial evidence 
implicates T cells in immunity against M. tuber- 
culosis, the surprising efficacy of i.v. BCG 
relative to the other vaccine routes (which 
also elicit T-cell responses) suggests that other 
mechanisms of immunity are also involved. As 
Darrah et al. propose, these might involve: 
antibody responses against M. tuberculosis; 
innate immune cells, which are activated indi- 
rectly by infection (and do not require specific 
recognition of M. tuberculosis antigens); or 
innate training, a process by which immune 
cells such as macrophages gain an enhanced 
ability to protect, often nonspecifically, 
against microbes. 

Darrah and co-workers’ findings raise 
the obvious possibility of controlling TB by 
giving people BCG by i.v. injection. In support 
of this idea, the intervention proved to be 
safe in the small cohort of rhesus macaques 
studied. But there is currently a drive to sim- 
plify vaccine deployment by eliminating the 
need for vaccines to be kept cold or for experts 
to administer them® — both of which are crucial 
for i.v. injection. 

Whether or not i.v. BCG is developed for 
clinical use, research that builds on Darrah and 
colleagues’ work could lead to an improved 
understanding of what protection against 
TB looks like — that is, to define correlates 
of protection. In addition, future work must 
delineate the mechanisms that lead to steri- 
lizing immunity after i.v. BCG. If successful, 
it might be possible to develop a vaccine 
designed to activate the same protective 
immune mechanisms as those triggered by 
i.v. BCG, but that could be administered in a 
way that is safe and adaptable to mass vaccination 
programmes. 

Microbiology 

Food for thought about 
manipulating gut bacteria 


Nathalie M. Delzenne & Laure B. Bindels 


Knowing how dietary fibre nourishes gut microorganisms 
might suggest ways to boost health-promoting bacteria. A 
method developed to pinpoint bacteria that consume particular 
types of dietary fibre could advance such efforts. 


Certain gut microorganisms can boost human 
health, but it is unclear how diet could be har- 
nessed to easily manipulate the composition 
of gut microbes to boost the levels of desired 
bacteria. Writing in Cell, Patnode et al.¹ present 
a useful approach for assessing interactions 
between human gut microbes and the dietary 
fibre that sustains their existence. 

Dietary fibre is promoted as part of a healthy 
diet worldwide. Many people, however, do 
not achieve their recommended fibre intake 
because they consume insufficient fruit, veg- 
etables and cereals. Inadequate fibre intake is 
associated with common conditions includ- 
ing obesity, diabetes and cancer’. Yet under- 
standing the mechanisms that link fibre-rich 
food to good health is challenging. Dietary 
fibre encompasses a wide range of complex 
molecules, most of which are present in plant 
cells; among them are carbohydrate molecules 
called glycans, which are resistant to digestion 
by human enzymes. As a consequence, some 
ingested fibre is excreted unchanged in faeces, 
whereas most is metabolized by gut microbes. 

These microbes have a diverse and extremely 
complex metabolic capacity. Bacteria that 
express different enzymes for metaboliz- 
ing fibre can survive and grow using a range 
of foods. Some bacterial species might 
compete with each other for the same food 
source, which could lower the abundance of 
species that compete less successfully. How 
might gut microbes be manipulated through 
human dietary intervention? For example, the 
concept of using prebiotics — compounds that 
affect gut microbes, thereby benefiting the 
human host — has been proposed. One such 
idea is to use particular fibre sources that provide 
food for the desired gut microbes**. However, 
determining whether dietary fibre can 
promote health in this way requires a sophis- 
ticated understanding of the interactions that 
occur when the complex community of gut 
microbes encounters a source of fibre. 
Previous work? had indicated that trans- 
ferring the gut microbes of human twins who 
have contrasting body masses (obese and lean) 
into mice induced a corresponding difference 
in the animals’ body masses. However, when 
some of the obese mice were housed with 
the lean mice, they had less adipose fat than 
did obese animals that were not co-housed 
with lean mice — and this weight-loss effect 
correlated with the transfer of Bacteroides 
bacterial species from the lean mice to the 
obese mice>. High consumption of fibre-rich 
plant foods was required for this adipose-fat 
reduction to occur’. However, the types of 
fibre responsible for this effect, and how these 
interact with specific gut microorganisms, were 
unknown. Patnode and colleagues now reveal 
how particular types of glycan can drive com- 
petition between different Bacteroides species 
resident in the human gut. 

Patnode et al. studied mice that lacked their 
normal microbes, and instead harboured 
15 strains of gut-dwelling bacteria from a 
lean human who had an obese twin. The 
authors fed the mice different combinations 
of fibre sources as part of their diet. Analys- 
ing faecal samples enabled the researchers 
to track how the diets affected the relative 
abundance of each bacterial species in the 
animals’ gut. This approach pinpointed, for 
example, a dose-response effect of pea fibre 
on the relative abundance of Bacteroides 
thetaiotaomicron in the bacterial population, 
as well as a pronounced effect of certain 
types of barley fibre (β-glucan and bran) on 
the relative abundance of Bacteroides ovatus. 
These results reveal the specificity of the 
effects that different forms of dietary fibre 
can have on bacterial populations. 

Figure 1 | Investigating how human gut-dwelling bacteria metabolize dietary fibre. a, Patnode et al.¹ 
gave mice that lacked their natural gut microbes a set of 15 bacterial strains that dwell in the human gut, 
including the species Bacteroides cellulosilyticus, Bacteroides ovatus and Bacteroides vulgatus. The authors 
developed a method for tracking fibre digestion. They generated magnetic beads coated with a fibre of 
interest, and fed these beads (termed food particles) to the animals. Applying a magnetic field enabled the 
recovery of food particles and assessment of the extent of fibre degradation. The animals received food 
particles that included some coated with pea fibre that is rich in the molecule arabinan, and some coated 
with the molecule arabinoxylan. B. vulgatus and B. cellulosilyticus competed to degrade the arabinan, 
B. cellulosilyticus degraded arabinoxylan, and B. ovatus degraded other molecules (not shown). b, When 
the experiment was repeated without B. cellulosilyticus, B. ovatus demonstrated metabolic flexibility, by 
switching to degrade arabinoxylan. B. ovatus degraded less arabinoxylan than did B. cellulosilyticus. 

To identify the genes required for a specific 
bacterium of interest to metabolize fibre, the 
authors gave mice bacterial strains that were 
engineered to contain mutations at random 
sites across their genome, and fed the animals 
different kinds of dietary fibre. By analysing 
the proteins in mouse faecal samples, the 
authors identified a set of bacterial proteins 
that allow certain microbes to grow success- 
fully in particular feeding regimes. For exam- 
ple, when mice received dietary fibre from 
fruit peelings (citrus pectin) that are rich in 
a type of molecule called methylated homo- 
galacturonan, this led to a rise in the expres- 
sion of proteins that degrade such molecules 
in the bacterium Bacteroides cellulosilyticus. 
And when mice received pea fibre, which is rich 
in a polymer molecule called arabinan (which 
contains the sugar arabinose), the expression 
of proteins involved in arabinan degradation 
rose in the bacterium B. thetaiotaomicron. 

Perhaps the most original part of this 
research is the development of artificial ‘food 
particles’ consisting of glycan-coated mag- 
netic beads (Fig. 1) that can be administered 
orally to mice and recovered by applying a 
magnetic field. Patnode et al. used this strategy 
to investigate how bacterial species respond to 
different food sources by assessing the extent 
of glycan degradation in the recovered beads. 
When mice that had been colonized only with 
B. cellulosilyticus or Bacteroides vulgatus were 
given food particles coated with pea fibre, the 
levels of arabinose in the recovered beads were 
lower than the original levels, demonstrat- 
ing that both of these bacterial species had 
metabolized this molecule in vivo. 

In a parallel experiment, mice were 
colonized either with all 15 bacterial strains 
from the lean twin, or with 14 of the strains 
(B. cellulosilyticus excluded), before being 
given food particles containing pea fibre 
(Fig. 1). The level of degradation of arabinose 
in the arabinan-rich pea-fibre beads was then 
compared, and was found to be the same in 
both cases. This suggests that some change 
occurs in the bacterial community, in the 
absence of B. cellulosilyticus, that enables 
arabinose from pea fibre to be degraded as 
much as it would be if all 15 bacterial strains 
were present. The story might be different for 
other forms of dietary fibre. 

Along with the food particles coated with 
pea fibre, the animals received some coated 
with molecules of arabinoxylan (a polymer 
of the sugars arabinose and xylose). How- 
ever, in the case of arabinoxylan, the bac- 
terial strains were less able to process this 
molecule when B. cellulosilyticus was absent 
than when it was present, and the arabinox- 
ylan-metabolizing activity was attributed 
to B. ovatus. In the absence of B. cellulosilyticus, 
B. ovatus undergoes a metabolic shift that 
boosts its ability to use arabinoxylan. When 
both B. ovatus and B. cellulosilyticus were 
absent from the bacterial populations, arabi- 
noxylan-coated beads retained their original 
levels of arabinose, revealing that none of the 
remaining 13 bacterial strains took advantage 
of arabinoxylan availability. 

This study reveals the flexibility and 
adaptability of gut microbes in response to 
their nutritional environment. It provides a 
useful focus on specific forms of dietary fibre 
and bacterial species known to be linked to 
diet-associated resistance to a rise in adipose 
tissue>. This ‘simplification’ of the context 
suggests a way forward in understanding the 
key genes and proteins of Bacteroides that are 
crucial for the degradation of dietary fibre, 
and that might affect the abundance of par- 
ticular gut bacteria. The findings also reveal 
how B. cellulosilyticus can have a dominant 
role in its interactions with certain bacteria 
with which it can compete for the same 
food source. The work also uncovers hidden 
metabolic flexibility, such as the ability of 
B. ovatus to adapt its metabolic strategy. 

When assessing this study, it is worth 
bearing in mind that Bacteroides is not the only 
type of bacterium that commonly uses dietary 
fibre for food, and that the fibre-containing 
foods tested by the authors are not the major 
sources of dietary fibre in a typical human diet. 
Moreover, the abundance of Bacteroides varies 
enormously between people’, and the hypoth- 
esis that key Bacteroides species might affect 
the success of dieting efforts to control obesity 
requires further investigation. 

Although it concentrates on Bacteroides 
only, Patnode and colleagues’ work represents 
useful progress towards developing person- 
alized nutrition strategies for tailoring gut 
microbes in the future. The study also com- 
plements other research’ ’ that explores how 
bacteria in the human gut might contribute 
to the body’s response to a particular diet. 
Thanks to Patnode et al., we have fresh insights 
into how specific types of bacterium use and 
compete for dietary fibre. Future research 
will undoubtedly continue to refine the link 
between fibre-rich food and health, by tak- 
ing into account the role of the gut microbial 
community. 
Infrared spectroscopy 
finally sees the light 


Andreas Barth 


The reliance of infrared spectroscopy on light transmission 
limits the sensitivity of many analytical applications. An 
approach that depends on the emission of infrared radiation 
from molecules promises to solve this problem. See p.52 


Atoms in molecules oscillate when irradiated 
by infrared light. The particular light frequen- 
cies that drive these vibrations are absorbed 
by molecules, and depend on the molecules’ 
chemical structure and environment. The 
infrared absorption spectrum of a sample can 
therefore be used as a molecular fingerprint 
by which to characterize its chemical compo- 
sition. This has made infrared spectroscopy 
a widespread analytical technique. However, 
infrared spectra are difficult to measure 
for low concentrations of analytes and for 
samples in water. On page 52, Pupeza et al.¹ 
present a concept for infrared spectroscopy 
that promises to alleviate these limitations. 
Infrared light was discovered’ as a result of 
the problem it caused William Herschel while 
he was making astronomical observations 
of the Sun — it created a disturbing heating 
sensation in his eye that he wanted to filter 
out. Today, however, the benefits of infrared 
radiation for a multitude of analytical pur- 
poses are widely appreciated. Its applications 
range from the detection of molecules in outer 
space*", including that of water on Mars°, to 
deciphering the molecular mechanisms of 
proteins in living organisms®”. In the every- 
day world, it is used in food analysis®* and in 
forensic police investigations®”’, for example. 
Much research is being done to bring infra- 
red spectroscopy to the clinic, because the 
analysis of biological tissue and body fluids 
can be used to detect and diagnose disease*”””. 

One of the main obstacles to the infrared 
analysis of biological samples is the strong 
absorption of infrared radiation by water, a 
problem that limits the sample thickness to 
less than 10 micrometres for most purposes. 
This issue also makes it difficult to add aqueous 
solutions of reagents (such as acids or salts) to 
samples to manipulate the state of molecules 
in the sample. Such manipulations are desirable, 
for example, for studying the binding of 
small molecules to proteins, and are stand- 
ard practice when using ultraviolet or visible 
spectroscopy. Furthermore, because infrared 
radiation is absorbed by water, samples must 
often be concentrated or dried. 

Pupeza and colleagues report a solution 
to this problem. They irradiate samples with 
an ultrashort pulse (on the scale of femtoseconds; 
1 fs is 10⁻¹⁵ seconds) of mid-infrared 
light. Specific frequencies of the light are 
absorbed by sample molecules, generating 
vibrations. These vibrations continue after 
the pulse has ended, and last until the vibra- 
tional energy is dissipated to the environment 
(which takes a few picoseconds; 1 ps is 10⁻¹² s). 
Because the vibrating atoms carry partial 
electrical charges, their oscillations gener- 
ate electromagnetic radiation, similar to the 
way in which oscillating electrons produce 
electromagnetic radiation in an antenna. The 
generated radiation has the same frequency as 
that of the molecular vibrations, and so carries 
information about all of the sample molec- 
ules — the authors therefore call it a global 
molecular fingerprint. It is measured using 
a second ultrashort pulse of light, this time 
in the near-infrared spectral range, through a 
method called electro-optic sampling”. 

Figure 1 | A fresh approach for obtaining infrared spectra. a, In conventional infrared spectroscopy, 
molecules are irradiated with infrared light. They absorb certain frequencies of the light, which causes them 
to vibrate. The signals of interest are the absorption ‘troughs’ in the transmitted light spectrum, but these 
change the overall intensity of the transmitted light only marginally when the samples are highly diluted, 
limiting the sensitivity of this technique. b, Pupeza et al.¹ irradiate analytical samples with ultrashort bursts 
of infrared light, again causing molecules in the sample to vibrate. These vibrations continue after the 
pulse has ended, and generate infrared radiation, shown here as a ‘tail’ that trails after the pulse. This tail is 
analysed to determine the infrared spectrum of the molecules. Because the experimental signal is emitted 
light and is detected directly, this method can be more sensitive than absorption infrared spectroscopy. 

The authors’ approach is conceptually 
different from conventional absorption 
measurements. In absorption spectroscopy, 
the signal is sensed only indirectly, from the 
light that does not interact with the sample 
(Fig. 1a). Weak absorption is therefore very 
difficult to detect, because it changes the 
intensity of the transmitted light only mar- 
ginally. Theoretically, the detection of weak 
absorbers could be improved by increasing the 
intensity of the incident light, but commonly 
used infrared detectors become less sensitive 
at higher light intensities”, imposing a practi- 
cal limit on the maximum light intensity that 
can be used. By contrast, Pupeza et al. detect 
the signal of interest — the radiation emit- 
ted from the vibrating molecules — directly 
(Fig. 1b). This is analogous to the difference 
between absorbance and fluorescence 
measurements in the visible spectral range: 
fluorescence measurements are the more 
sensitive because they detect a signal directly 
from the sample, and can even detect it from 
a single molecule. 

Pupeza and colleagues demonstrate the 
high sensitivity of their approach in various 
ways. For example, they were able to detect 
40-fold lower concentrations of a compound 
in solution, and to better distinguish between 
two similar compounds, than when using 
absorption spectroscopy. They also obtained 
spectra of biological samples that block nearly 
all of the incoming light (in one case, at least 
99.999%). Thus, the new approach senses 
light where currently used methods see only 
darkness. This is an impressive achievement, 
and might alleviate both of the main prob- 
lems of conventional infrared spectroscopy: 
sensitivity and strong infrared absorption by 
water. It will simplify sample preparation in 
many cases by removing the need for sample 
concentration or drying, and will open up new 
applications — particularly those involving 
aqueous biological samples. 
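
The scale of the problem can be put into numbers. The short sketch below is an illustration only: it assumes the standard definition of absorbance, A = -log10(I/I0), and its function names are hypothetical rather than anything taken from Pupeza and colleagues' work. It shows that a sample blocking 99.999% of the light has an absorbance of about 5, whereas a weak absorber barely changes the transmitted intensity.

```python
import math


def absorbance(transmitted_fraction: float) -> float:
    """Absorbance A = -log10(I / I0) for a transmitted fraction I / I0."""
    return -math.log10(transmitted_fraction)


def transmitted_fraction(absorbance_value: float) -> float:
    """Fraction of the incident light that passes a sample of absorbance A."""
    return 10.0 ** (-absorbance_value)


# A sample that blocks 99.999% of the light transmits a fraction of 1e-5,
# which corresponds to an absorbance of about 5.
print(f"A = {absorbance(1e-5):.3f}")

# A weak absorber (A = 0.001, an assumed value) removes only a fraction of a
# per cent of the transmitted light, which is why it is hard to detect.
weak_change = 1.0 - transmitted_fraction(0.001)
print(f"relative change in transmitted light = {weak_change:.4%}")
```

In an absorption measurement, the quantity of interest is the small second number riding on a large background; in the emission approach described above, the background pulse has already ended when the signal is recorded.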

The authors suggest several ideas for taking 
the method further, such as by increasing the 
power of the laser used to irradiate the sample. 
It is to be hoped that such measures will further 
narrow the technological gap that at present 
prevents the method from achieving the ulti- 
mate goal of single-molecule sensitivity in bulk 
water. Other challenges will be to increase 
the spectral range of the measurements to 
include the shorter wavelengths at which 
prominent and diagnostically useful signals 
are found for proteins, lipids and nucleotides, 
and to develop a spectrometer suitable for 
commercialization at a competitive price. 
AI shows promise for 
breast cancer screening 


Etta D. Pisano 


Could artificial intelligence improve the accuracy of 
screening for breast cancer? A comparison of the diagnostic 
performance of expert physicians and computers suggests so, 
but the clinical implications are as yet uncertain. See p.89 


Screening is used to detect breast cancer early 
in women who have no obvious signs of the 
disease. This image-analysis task is challenging 
because cancer is often hidden or masked in 
mammograms by overlapping ‘dense’ breast 
tissue. The problem has stimulated efforts to 
develop computer-based artificial-intelligence 
(AI) systems to improve diagnostic performance. 
On page 89, McKinney et al.¹ report the 
development of an AI system that outperforms 
expert radiologists in accurately interpreting 
mammograms from screening programmes. 
The work is part of a wave of studies investigating 
the use of AI in a range of medical-imaging 
contexts². 

Despite some limitations, McKinney and 
colleagues’ study is impressive. Its strengths 
include the large scale of the data sets used for 
training and subsequently validating the AI 
algorithm. Mammograms for 25,856 women 
in the United Kingdom and 3,097 women in 
the United States were used to train the AI system. 
The system was then used to identify the 
presence of breast cancer in mammograms of 
women who were known to have had either 
biopsy-proven breast cancer or normal fol- 
low-up imaging results at least 365 days later. 
These outcomes are the widely accepted gold 
standard for confirming breast cancer status 
in people undergoing screening for the dis- 
ease. The authors report that the AI system 
outperformed both the historical decisions 
made by the radiologists who initially assessed 
the mammograms, and the decisions of 6 
expert radiologists who interpreted 500 ran- 
domly selected cases in a controlled study. 

McKinney and colleagues’ results suggest 
that AI might some day have a role in aiding 
the early detection of breast cancer, but the 
authors rightly note that clinical trials will 
be needed to further assess the utility of 
this tool in medical practice. The real world 
is more complicated and potentially more 
diverse than the type of controlled research 
environment reported in this study. For exam- 
ple, the study did not include all the different 
mammography technologies currently in 
use, and most images were obtained using a 
mammography system from a single manu- 
facturer. The study included examples of two 
types of mammogram: tomosynthesis (also 
known as 3D mammography) and conven- 
tional digital (2D) mammography. It would 
be useful to know how the system performed 
individually for each technology. 




The demographics of the population 
studied by the authors are not well defined, 
apart from by age. The performance of AI 
algorithms can be highly dependent on the 
population used in the training sets. It is therefore 
important that a representative sample of 
the general population be used in the devel- 
opment of this technology, to ensure that the 
results are broadly applicable. 

Another reason to temper excitement 
about this and similar AI studies is the lessons 
learnt from computer-aided detection (CAD) 
of breast cancer. CAD, an earlier computer 
system aimed at improving mammography 
interpretation in the clinic, showed great 
promise in experimental testing, but fell 
short in real-world settings³. CAD marks 
mammograms to draw the interpreter’s 
attention to areas that might be abnormal. 
However, analysis of a large sample of clini- 
cal mammography interpretations from the 
US Breast Cancer Surveillance Consortium 
registry demonstrated that there was no 
improvement in diagnostic accuracy with 
CAD³. Moreover, that study revealed that the 
addition of CAD worsened sensitivity (the 
performance of radiologists in determining 
that cancer was present), thus increasing the 
likelihood of a false negative test. CAD did not 
result in a significant change in specificity (the 
performance of radiologists in determining 
that cancer was not present) and the likelihood 
of a false positive test³. 
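
The two measures have simple definitions, which the following sketch spells out. It is an illustration only: the function names are hypothetical and the counts are invented, not taken from the registry study or from McKinney and colleagues' work.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of cancers that were detected: TP / (TP + FN)."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of cancer-free cases correctly cleared: TN / (TN + FP)."""
    return true_neg / (true_neg + false_pos)


# Invented counts for 1,000 screening examinations (illustration only).
tp, fn, tn, fp = 8, 2, 900, 90

print(f"sensitivity = {sensitivity(tp, fn):.2f}")   # 8 of 10 cancers found
print(f"specificity = {specificity(tn, fp):.3f}")   # 900 of 990 cleared
```

A fall in sensitivity means more missed cancers (false negatives); a fall in specificity means more unnecessary callbacks (false positives).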

It has been speculated that CAD was not 
as useful in the clinic as experimental data 
suggested it might be because radiologists 
ignored or misused its input owing to the 
high frequency of marks on the images that 
were not findings suggestive of cancer. This 
outcome was attributed by some to the 
limited processing power available for CAD, 
which meant that comparisons with previous 
imaging studies of the same person were not 
possible⁴. Thus, CAD might mark regions that 
were not changing over time and that could be 
easily dismissed by expert readers. Another 
factor that limited CAD is that it was developed 
using the performance of human-based diag- 
nosis. It was trained using mammograms in 
which humans had found signs of cancer and 
others that were false negatives — cases in 
which humans could not see signs of cancer 
although the disease was indeed present⁴. 
Similar pitfalls could be encountered with 
AI-based decision aids, too. 

A system by which AI finds abnormalities 
that humans miss will require radiologists to 
adapt to the use of these types of tool. Imagine 
a system in which an algorithm marks a dense 
breast area on a screening mammogram and 
the human radiologist cannot see anything 
that looks potentially malignant. With CAD, 
radiologists scrutinize the areas marked, and 
if they decide the mark is probably not cancer, 
they assign the mammogram as being negative 
for malignancy. However, if AI algorithms 
are to make a bigger difference than CAD in 
detecting cancers that are currently missed, 
an abnormality detected by the AI system, 
but not perceived as such by the radiologist, 
would probably require extra investigation. 
This might result in a rise in the number of 
people who receive callbacks for further evaluation. 
A clinical trial would show the effect of 
the AI system on the detection of cancer and 
the rate of false positive diagnoses, while also 
allowing the development of effective clinical 
practice in response to mammograms flagged 
as abnormal by AI but not by the radiologist. 

In addition, it would be essential to develop 
a mechanism for monitoring the performance 
of the AI system as it learns from cases it 
encounters, as occurs in machine-learning 
algorithms. Such performance metrics would 
need to be available to those using these tools, 
in case performance deteriorates over time. 

It is sobering to consider the sheer volume 
of data needed to develop and test AI 
algorithms for clinical tasks. Breast cancer 
screening is perhaps an ideal application for AI 
in medical imaging because large curated data 
sets suitable for algorithm training and test- 
ing are already available, and information for 
validating straightforward clinical end points 
is readily obtainable. Breast cancer screening 
programmes routinely measure their diagnos- 
tic performance — whether cancer is correctly 
detected (a true positive) or missed (a false 
negative). Some areas found on mammograms 
might be identified as abnormal but turn out 
on further testing not to be cancerous (false 
positives). For most women, screening iden- 
tifies no abnormalities, and when there is still 
no evidence of cancer one year later, this is 
classified as a true negative. 
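
The four outcomes described above amount to a small lookup on two yes-or-no facts: whether the screening flagged an abnormality, and whether cancer was later confirmed. The sketch below is a hypothetical illustration of that bookkeeping; the record format and names are assumptions, and the records themselves are invented.

```python
from collections import Counter
from enum import Enum


class Outcome(Enum):
    TRUE_POSITIVE = "true positive"
    FALSE_NEGATIVE = "false negative"
    FALSE_POSITIVE = "false positive"
    TRUE_NEGATIVE = "true negative"


def classify(flagged_abnormal: bool, cancer_confirmed: bool) -> Outcome:
    """Map one screening result and its follow-up onto the four outcomes."""
    if cancer_confirmed:
        return Outcome.TRUE_POSITIVE if flagged_abnormal else Outcome.FALSE_NEGATIVE
    return Outcome.FALSE_POSITIVE if flagged_abnormal else Outcome.TRUE_NEGATIVE


# Invented records: (flagged_abnormal, cancer_confirmed).
records = [(True, True), (False, False), (True, False), (False, False), (False, True)]

tally = Counter(classify(flagged, cancer) for flagged, cancer in records)
for outcome in Outcome:
    print(f"{outcome.value}: {tally[outcome]}")
```

Because every screened woman falls into exactly one of these categories once follow-up is complete, a screening programme can audit its own performance routinely, which is what makes the data so well suited to training and testing algorithms.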

Most other medical tasks have more- 
complicated clinical outcomes, however, in 
which the clinician’s decision is not a binary 
one (between the presence or absence of 
cancer), and thus further signs and symptoms 
must also be considered. In addition, most 
diseases lack readily accessible, validated 
data sets in which the ‘truth’ is defined rela- 
tively easily. Obtaining validated data sets for 
more-complex clinical problems will require 
greater effort by readers and the develop- 
ment of tools that can interrogate electronic 
health records to identify and annotate cases 
representing specific diagnoses. 

To achieve the promise of AI in health care 
that is implied by McKinney and colleagues’ 
study, anonymized data in health records 
might thus have to be treated as precious 
resources of potential benefit to human 
health, in much the same way as public utilities 
such as drinking water are currently treated. 
Clearly, however, if such AI systems are to be 
developed and used widely, attention must 
be paid to patient privacy, and to how data are 
stored and used, by whom, and with what type 
of oversight. 


Etta D. Pisano is at the American College 
of Radiology, Philadelphia, Pennsylvania 
19103, USA, and at Beth Israel Lahey Medical 
Center, Harvard Medical School, Boston, 
Massachusetts. 

e-mail: episano@bidmc.harvard.edu 


1. McKinney, S. M. et al. Nature 577, 89–94 (2020). 
2. Neri, E. et al. Insights Imaging 10, 44 (2019). 
3. Lehman, C. D. et al. JAMA Intern. Med. 175, 1828–1837 (2015). 
4. Kohli, A. & Jha, S. J. Am. Coll. Radiol. 15, 535–537 (2018). 


Galaxy cluster illuminates 
the cosmic dark ages 


Nina A. Hatch 


Observations of a distant cluster of galaxies suggest that 
star formation began there only 370 million years after the 
Big Bang. The results provide key details about where and when 
the first stars and galaxies emerged in the Universe. See p.39 


Shortly after the Big Bang, the Universe was 
completely dark. Stars and galaxies, which 
provide the Universe with light, had not yet 
formed, and the Universe consisted of a pri- 
mordial soup of neutral hydrogen and helium 
atoms and invisible ‘dark matter’. During 
these cosmic dark ages, which lasted for 
several hundred million years, the first stars 
and galaxies emerged. Unfortunately, obser- 
vations of this era are challenging because 
dark-age galaxies are exceptionally faint. On 
page 39, Willis et al. provide a glimpse of what 
happened during the dark ages by doing some 
galactic archaeology. By measuring the ages 
of stars in one of the most distant clusters of 
galaxies known, the authors located galaxies 
that formed stars in the dark ages, close to the 
earliest possible time that stars could emerge. 

A galaxy cluster is a group of thousands 
of galaxies that orbit each other at speeds 
of about 1,000 kilometres per second. They 
are prevented from flying apart by the grav- 
itational pull of the accompanying dark 
matter, which has the equivalent total mass 
of about one hundred trillion Suns. Astronomers 
use these clusters as laboratories for 
many experiments in astrophysics, such as 
measuring the composition of the Universe, 
testing theories of gravity and determining 
how galaxies form. Willis et al. used one of the 
Figure 1 | Chronology of the Universe. After the Big Bang, the Universe consisted of a cosmic soup of 
radiation and matter. About 400,000 years later, it entered an era known as the cosmic dark ages in 
which it was devoid of light. The first stars and galaxies began to emerge a few hundred million years 
later, and gradually provided the Universe with light. Willis et al. report that star formation in a distant 
cluster of galaxies began roughly 370 million years after the Big Bang. The light that we see from this 
galaxy cluster was emitted when the Universe was about 3.3 billion years old. The cluster is likely to have 
become one of the largest structures in the present-day Universe, comparable in mass to the Coma cluster. 
(Image credits: Willis and colleagues’ galaxy cluster: N. A. Hatch; Coma cluster: Russ Carroll, Rob Gendler, 
Bob Franke/Dan Zowada Memorial Observatory, Wayne State Univ.) 


most distant clusters known to study when the 
most massive galaxies in the Universe began 
to produce stars. 

Although nearby clusters, such as the Coma 
cluster, are easier to observe than those far- 
ther away, we cannot measure their ages 
precisely because the galaxies are extremely 
old. It is difficult to differentiate between, for 
example, a galaxy that is 7 billion years old and 
one that is 13 billion years old. Therefore, to 
obtain a precise date for when clusters first 
formed their stars, Willis and colleagues used 
NASA’s Hubble Space Telescope to look at one 
of the most distant clusters they could find. 

Because light travels at a finite speed, the 
most distant clusters we can see are also 
those in the earliest stages of the Universe 
that we can see. The light from the cluster 
examined by Willis et al. has been travelling 
for 10.4 billion years before it reaches Earth, 
which means that we are looking at a cluster as 
it was just 3.3 billion years after the Big Bang. 
Consequently, this cluster acts as a keyhole 
through which we can peer into the early 
Universe (Fig. 1). 

Willis and colleagues found that the cluster 
contains several galaxies that have similar red 
colours. The colour of a galaxy can be used to 
estimate its age because younger stars are 
bluer than their older, redder counterparts. As 
a result, galaxies that have red colours formed 
their stars a long time ago. By comparing the 
colours of the cluster galaxies with those of 
models, the authors estimated that the stars 
of these galaxies started to emerge when the 
Universe was only 370 million years old. This 
epoch is when we expect the first stars to have 
formed in the cosmic dark ages. 


© 2020 Springer Nature Limited. All rights reserved. 


One particularly intriguing point is that 
Willis et al. identified at least 19 galaxies in 
the cluster that have similar colours, which 
means that the galaxies have similar ages. At 
the time when these galaxies formed their 
stars, they would have been well spread out, 
so it is a conundrum as to why they all began 
producing stars at approximately the same 
time. Were they influenced by their environment? 
Alternatively, did the star formation in 
one galaxy somehow trigger a chain reaction, 
leading to star formation in nearby gas clouds? 
We do not currently have the answer, but what 
is clear from the authors’ work is that these 
distant clusters are full of the oldest galaxies 
in the Universe. 

In my opinion, Willis and colleagues’ age 
estimates are the best ones possible, given 
the limited data that the authors have from 
the Hubble telescope. However, determining 
ages from the colours of galaxies is a relatively 
crude method that is subject to large uncer- 
tainties. For example, a young galaxy that 
contains a lot of astronomical dust can have 
the same colour as an old galaxy containing 
little dust. Therefore, although the authors’ 
results are tantalizing, they should be treated 
with caution until NASA’s James Webb Space 
Telescope (JWST) is launched in the next 
few years. 

The JWST will measure spectra of the light 
emitted by these galaxies. A comparison of 
the spectra with models will be a much more 
accurate way to determine the ages of the stars 
than using the colours of galaxies. Further- 
more, because it is easier to measure the ages 
of earlier galaxies than those of more recent 
ones, it makes sense to target galaxies in the 
progenitors of these galaxy clusters in the 
early Universe. Willis and colleagues’ results 
make a strong case for these distant clusters 
being some of the first targets that the JWST 
should observe. 
Galaxy clusters are the most massive virialized structures in the Universe and are 
formed through the gravitational accretion of matter over cosmic time. The 
discovery of an evolved galaxy cluster at redshift z = 2, corresponding to a look-back 
time of 10.4 billion years, provides an opportunity to study its properties. The galaxy 
cluster XLSSC 122 was originally detected as a faint, extended X-ray source in the XMM 
Large Scale Structure survey and was revealed to be coincident with a compact overdensity 
of galaxies with photometric redshifts of 1.9 ± 0.2. Subsequent observations 
at millimetre wavelengths detected a Sunyaev–Zel’dovich decrement along the line of 
sight to XLSSC 122, thus confirming the existence of hot intracluster gas, while deep 
imaging spectroscopy from the European Space Agency’s X-ray Multi-Mirror Mission 
(XMM-Newton) revealed an extended, X-ray-bright gaseous atmosphere with a virial 
temperature of 60 million kelvin, enriched with metals to the same extent as are local 
clusters. Here we report optical spectroscopic observations of XLSSC 122 and identify 
37 member galaxies at a mean redshift of 1.98, corresponding to a look-back time of 
10.4 billion years. We use photometry to determine a mean, dust-free stellar age of 
2.98 billion years, indicating that star formation commenced in these galaxies at a 
mean redshift of 12, when the Universe was only 370 million years old. The full range of 
inferred formation redshifts, including the effects of dust, covers the interval from 7 
to 13. These observations confirm that XLSSC 122 is a remarkably mature galaxy 
cluster with both evolved stellar populations in the member galaxies and a hot, metal- 
rich gas composing the intracluster medium. 


To further our understanding of this galaxy cluster, particularly the 
properties of its member galaxies, we undertook a series of observations 
of XLSSC 122 with the Hubble Space Telescope (HST) Wide Field 
Camera 3 (WFC3). We obtained images of the cluster in two wavebands, 
F105W and F140W, and performed low-spectral-resolution slitless 
spectroscopy using the G141 grism (see Methods). These observations 
cover the observed frame wavelength interval 1.0–1.7 μm, corresponding 
to an interval of 0.33 μm to 0.57 μm in the rest frame of a galaxy 
at redshift z = 2. Figure 1 displays the F140W image of XLSSC 122 and 
shows a compact cluster of galaxies associated with the extended X-ray-emitting 
region. 
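
The conversion between observed and rest-frame wavelengths quoted above follows from the definition of redshift, λ_rest = λ_obs / (1 + z). The minimal sketch below simply checks the arithmetic; the function name is hypothetical and not part of any reduction pipeline.

```python
def rest_wavelength_um(observed_um: float, z: float) -> float:
    """Rest-frame wavelength (micrometres) of light observed at redshift z."""
    return observed_um / (1.0 + z)


z_cluster = 2.0
blue_edge = rest_wavelength_um(1.0, z_cluster)   # about 0.33 micrometres
red_edge = rest_wavelength_um(1.7, z_cluster)    # about 0.57 micrometres
print(f"rest-frame coverage: {blue_edge:.2f} to {red_edge:.2f} micrometres")
```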

We extracted one-dimensional spectra of all galaxies identified within 
the dispersed G141 grism image of the field (see Methods) and com- 
puted redshifts using a galaxy template-fitting algorithm with redshift 
as a free parameter. Figure 2 displays the histogram of galaxy redshifts 
in the field of XLSSC 122 over the restricted interval 1.9 < z < 2.05. Inspection 
of this interval reveals a primary peak at z = 1.98 associated with 
the central, red galaxies closest to the X-ray peak, and a secondary 
redshift peak at z = 1.93 associated with a mixture of red and blue galaxies, 
located at larger projected cluster-centric distances (see Fig. 1). 


The line-of-sight separation between z=1.93 and z=1.98 is 76 co-moving 
megaparsecs, far larger than the size of the XLSSC 122 cluster, and 
the two structures are therefore physically distinct. As outlined in the 
Methods, we identify 37 galaxies as being members of the cluster and a 
further 13 galaxies as members of the foreground structure. 
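
The quoted line-of-sight separation can be checked with a short numerical integration. The sketch below is an illustration, not the analysis code used for this work: it assumes a flat universe containing only matter and a cosmological constant, uses the parameter values given in the Methods, and integrates c/H(z) between the two redshifts with Simpson's rule. All names are hypothetical.

```python
import math

C_KM_S = 299_792.458   # speed of light in km/s
H0 = 69.6              # Hubble constant in km/s/Mpc (value from the Methods)
OMEGA_M = 0.286        # matter density parameter
OMEGA_L = 0.714        # cosmological-constant density parameter


def inverse_e(z: float) -> float:
    """1 / E(z), with E(z) = H(z) / H0 for a flat matter + Lambda model."""
    return 1.0 / math.sqrt(OMEGA_M * (1.0 + z) ** 3 + OMEGA_L)


def comoving_separation_mpc(z_near: float, z_far: float, steps: int = 200) -> float:
    """Line-of-sight comoving distance between two redshifts (Simpson's rule)."""
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of intervals
    h = (z_far - z_near) / steps
    total = inverse_e(z_near) + inverse_e(z_far)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * inverse_e(z_near + i * h)
    return (C_KM_S / H0) * total * h / 3.0


separation = comoving_separation_mpc(1.93, 1.98)
print(f"separation = {separation:.1f} comoving Mpc")   # close to the quoted 76
```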

We performed photometry of all galaxies within the HST field of view 
in both the F105W and F140W images (see Methods) and summarize 
this information in Fig. 3. The galaxies identified at 1.9 < z < 2.05 form a 
clear bimodal distribution in colour with a well populated sequence of 
red galaxies (corresponding to larger values of F105W − F140W) clearly 
separated from a broader distribution of blue galaxies. 

Interpreting this red sequence as representing a restricted locus of 
star-formation histories, Bower, Lucey and Ellis employed B − V (the 
astronomical magnitude difference between a blue and a visual filter) 
photometry of red-sequence galaxies in the Virgo and Coma clusters to 
constrain the dispersion of stellar ages in their member galaxies. The 
F105W and F140W photometry obtained for XLSSC 122 at z = 2 spans 
almost exactly the age-sensitive break feature at wavelength 4,000 Å 
in the member galaxy rest-frame spectral energy distributions (SEDs) 
and permits a similar analysis. 
Fig. 1 | HST image of the galaxy cluster XLSSC 122. The greyscale is the F140W 
image. Contours display X-ray emission corresponding to the 100-ks XMM- 
Newton image presented in ref. *. The dashed circle is drawn with a radius equal 
to the measured value of r500 (the radius within which the average matter 
density is 500 times the critical density of the Universe). Spectroscopic ‘gold’ 
and ‘silver’ members (see Methods) of the z= 1.98 cluster are indicated by red 
and green circles, respectively. Members of the z=1.93 foreground structure 
are indicated by blue circles. See text for further details. 


We therefore employed the F105W and F140W photometry of red- 
sequence galaxies to constrain the posterior distributions of luminos- 
ity-weighted stellar age and stellar mass for a set of synthetic stellar 
population models (see Methods). Computing the product of these 
posterior distributions generates a mean posterior on the luminosity- 
weighted stellar age of the red-sequence cluster members (Fig. 4), the 
details of which are presented in Table 1. 
Fig. 2 | Redshift histogram of all galaxies along the line of sight to XLSSC 122. 
The histogram considers galaxies satisfying the magnitude measurement 
F140W_Kron < 24. Galaxies classified as ‘gold’ members of the z = 1.98 cluster are 
shown in red, ‘silver’ members are shown in green and members of the z = 1.93 
structure are shown in blue. Galaxies not classified as a member of either the 
z=1.98 cluster or the z=1.93 structure are shown in grey. The vertical dashed 
lines show the unweighted mean redshift of both the cluster and the 
foreground structure (see text for further details). 
Fig. 3 | Colour-magnitude diagram of all galaxies within the HST/WFC3 field 
of view. Spectroscopically confirmed z = 1.98 ‘gold’ and ‘silver’ cluster members 
are indicated as red squares and green triangles, respectively. Members of the 
z = 1.93 structure are indicated as blue squares. Galaxies at z = 2 yet which are not 
formal cluster members are shown as solid black squares, whereas potentially 
contaminated or confused spectroscopic sources are shown as open black 
squares. Galaxies with visually classified emission lines are marked using black 
circles (only z = 1.98 and z = 1.93 are marked in this manner). All other galaxies in 
the field are indicated by grey squares. Error bars indicate the 1-sigma 
measurement uncertainty. The spectroscopic completeness limits of 
F140W_Kron = 24 and 24.5 are indicated by the vertical dashed and dotted lines, 
respectively. The horizontal dot-dashed line shows the lower colour limit for a 
source to be considered on the cluster’s red sequence. The angled solid line 
indicates a simple least-squares fit to the colour-magnitude relation for red- 
sequence cluster members. Subscript ‘ap’ indicates that the magnitudes of 
these objects are measured within an aperture of fixed angular size as opposed 
to a flexible aperture, indicated by subscript ‘Kron’. 


Our analysis assumes that the tight correlation of colour on the red 
sequence arises from scatter in age at fixed metallicity and internal dust 
absorption. Assuming no dust absorption (A_V = 0.0) we determine a 
mean red-sequence luminosity-weighted stellar age of 2.98 billion years 
(Gyr), corresponding to a redshift marking the onset of star formation 
of 12. This value is consistent with the inferred formation redshifts of 
the earliest observations of star formation in the Universe. We also 
consider dust absorption characterized by A_V = 0.3 and A_V = 0.5, which 
generate lower mean stellar ages and greater dispersion of the mean 
age. Despite the uncertainties that govern a number of the assumptions 
in such stellar population analyses, the main conclusion of this analysis 
is that red-sequence galaxies in XLSSC 122 are composed of stars of 
uniformly old age. When combined with the already large look-back 
time to this cluster, it is clear that star formation occurred in these 
galaxies in a coordinated manner at early times. 


Table 1 | The mean luminosity-weighted stellar ages of red-sequence cluster galaxies 

SED model A_V    Mean t_LW (Gyr)    Mean formation redshift (spread) 
0.0              2.98 ± 0.05        12.0 (10.9–13.3) 
0.3              2.77 ± 0.13        8.7 (8.3–10.4) 
0.5              2.63 ± 0.11        7.4 (6.6–8.3) 

The table lists the mean and standard deviation of the computed luminosity-weighted stellar 
age t_LW for SED models of specified A_V. Corresponding values of the mean formation redshift 
and spread (from standard deviation) are computed for the assumed cosmological model. 
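
The link between a stellar age and a formation redshift can be sketched with the closed-form age of a flat universe containing matter and a cosmological constant. The code below is an illustration under that assumption, using the parameter values given in the Methods and neglecting radiation; it is not the procedure used to build the table, whose entries are means over posterior distributions, so converting a single mean age reproduces them only approximately. All names are hypothetical.

```python
import math

H0 = 69.6                          # km/s/Mpc (value from the Methods)
OMEGA_M = 0.286
OMEGA_L = 0.714
HUBBLE_TIME_GYR = 977.792 / H0     # 1/H0 in Gyr; 977.792 converts the units


def age_gyr(z: float) -> float:
    """Age of the Universe at redshift z for a flat matter + Lambda model."""
    x = math.sqrt(OMEGA_L / OMEGA_M) * (1.0 + z) ** -1.5
    return HUBBLE_TIME_GYR * 2.0 / (3.0 * math.sqrt(OMEGA_L)) * math.asinh(x)


def redshift_at_age(t_gyr: float) -> float:
    """Invert age_gyr: the redshift at which the Universe had age t_gyr."""
    s = math.sinh(1.5 * math.sqrt(OMEGA_L) * t_gyr / HUBBLE_TIME_GYR)
    return (math.sqrt(OMEGA_L / OMEGA_M) / s) ** (2.0 / 3.0) - 1.0


t_observed = age_gyr(1.98)   # age of the Universe at the cluster redshift
for a_v, mean_age in [(0.0, 2.98), (0.3, 2.77), (0.5, 2.63)]:
    z_form = redshift_at_age(t_observed - mean_age)
    print(f"A_V = {a_v}: stars formed near z = {z_form:.1f}")
```

Under these assumptions the three mean ages map to formation redshifts close to 12, 8.7 and 7.4, and the dust-free case corresponds to a cosmic age of roughly 370 million years.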
Fig. 4 | The luminosity-weighted age distribution of stars within red- 
sequence cluster galaxies. The lines depict mean t_LW posteriors for 19 ‘gold’ 
z = 1.98 cluster red-sequence members for each of the three SED models 
characterized by A_V = 0.0 (solid), A_V = 0.3 (dashed) and A_V = 0.5 (dotted). The 
vertical dashed line indicates the age of the Universe at z = 1.98 for the assumed 
cosmological model. 


Analysis of the X-ray-emitting gas in XLSSC 122 provides an alterna- 
tive perspective on the formation history of the galaxy cluster as a 
whole. Using standard theory, one may combine the X-ray gas temperature 
(k_BT = 5 keV, where k_B is the Boltzmann constant) and an estimate 
of the virial radius of the cluster (r500 = 440 kpc, ref. *) to obtain 
a sound-crossing time for XLSSC 122 of 3.3 × 10⁸ yr. Hydrodynamical 
simulations of the gas physics in a forming cluster indicate that structures 
typically achieve virial equilibrium following a minimum of 2 to 
3 sound-crossing timescales. This indicates that XLSSC 122 is unlikely 
to have assembled earlier than 1 Gyr before the epoch of observation, 
equivalent to a redshift of 2.8. Although this argument does not place 
an upper limit on the elapsed time between assembly and virialization, 
the assembly of a cluster of mass equal to XLSSC 122 at redshifts 
greater than 3 appears unlikely. It is clear, therefore, even allowing 
for the uncertainty present in these estimates, that coordinated star 
formation in the member galaxies located in XLSSC 122 preceded the 
assembly of the cluster environment. 
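
The order of magnitude of the sound-crossing time can be recovered with a rough estimate. The sketch below is illustrative only and rests on assumptions that are not stated in the text: an ideal monatomic gas (adiabatic index 5/3), a mean molecular weight of 0.6 for the ionized plasma, and a crossing length equal to the quoted radius. With those choices it returns a few times 10⁸ yr, the same order as the value given above; the exact figure shifts with each assumption. All names are hypothetical.

```python
import math

C_KM_S = 299_792.458                  # speed of light in km/s
PROTON_REST_ENERGY_KEV = 938_272.0    # proton rest energy m_p c^2 in keV
YEARS_PER_KPC_AT_1_KM_S = 9.778e8     # time to travel 1 kpc at 1 km/s


def sound_speed_km_s(kt_kev: float, gamma: float = 5.0 / 3.0, mu: float = 0.6) -> float:
    """Adiabatic sound speed sqrt(gamma * kT / (mu * m_p)); gamma and mu are assumed."""
    return C_KM_S * math.sqrt(gamma * kt_kev / (mu * PROTON_REST_ENERGY_KEV))


def sound_crossing_time_yr(radius_kpc: float, kt_kev: float) -> float:
    """Time for a sound wave to travel radius_kpc through gas at temperature kT."""
    return radius_kpc / sound_speed_km_s(kt_kev) * YEARS_PER_KPC_AT_1_KM_S


speed = sound_speed_km_s(5.0)
crossing = sound_crossing_time_yr(440.0, 5.0)
print(f"sound speed = {speed:.0f} km/s")
print(f"crossing time = {crossing:.2e} yr")   # a few times 1e8 yr
```

Two to three such crossing times amount to roughly 1 Gyr, which is the interval used in the argument above.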

Computer simulations of the accretion history of massive, gravita- 
tionally bound halos in an expanding Universe indicate that it is likely 
that XLSSC 122 will evolve with time into a present-day galaxy cluster 
comparable in mass to that of Coma, that is, about 1 × 10¹⁵ solar 
masses. Although caution is required in both the interpretation of 
the scatter in the accretion histories of halos of fixed total mass and 
the more subtle point of whether XLSSC 122 represents a typical galaxy 
cluster at z = 2 or is perhaps an extreme case, the conclusion remains 
robust that this system will continue to grow in mass until it becomes 
a massive galaxy cluster in the present-day Universe. 

The recent discovery of SPT2349-56, a massive proto-cluster of galax- 
ies at a redshift of 4.3 (ref. ”), provides a further, tantalising, glimpse 
of the kind of structure from which XLSSC 122 may have evolved. The 
same structure growth simulations that predict the future evolution 
of massive halos can also be used to infer their likely past accretion 
histories. Even taking into account the caveats expressed above, such 
simulations indicate that structures such as SPT2349, XLSSC 122 and 
Coma may represent similar clusters viewed at very different cosmic 
epochs. From such studies we are beginning to achieve a coherent view 
of the formation and evolution of the largest gravitationally bound 
structures in the Universe. 
Methods 


We assume a Lambda cold dark matter (ΛCDM) cosmological 
model described by the parameters Ω_m = 0.286, Ω_Λ = 0.714, H_0 = 
69.6 km s⁻¹ Mpc⁻¹ (ref. *). The present-day age of the Universe in this 
model is 13.72 Gyr. All magnitude information is presented using 
the AB system. 
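
As a consistency check on these parameters, the age of the Universe at any redshift has a closed form when the model is flat and radiation is neglected (both are assumptions of this sketch). The code below is a minimal illustration with hypothetical names, not the software used for the analysis; it recovers a present-day age close to the value quoted above and the look-back time to z = 1.98.

```python
import math

# Parameter values quoted in the text (flat model: the two densities sum to 1).
H0_KM_S_MPC = 69.6
OMEGA_M = 0.286
OMEGA_L = 0.714

KM_PER_MPC = 3.0857e19
SECONDS_PER_GYR = 3.15576e16


def hubble_time_gyr() -> float:
    """Return 1/H0 expressed in gigayears."""
    return KM_PER_MPC / H0_KM_S_MPC / SECONDS_PER_GYR


def age_gyr(z: float) -> float:
    """Age of a flat matter + Lambda universe at redshift z, in Gyr."""
    x = math.sqrt(OMEGA_L / OMEGA_M) * (1.0 + z) ** -1.5
    return hubble_time_gyr() * 2.0 / (3.0 * math.sqrt(OMEGA_L)) * math.asinh(x)


age_now = age_gyr(0.0)         # close to the 13.72 Gyr quoted above
age_cluster = age_gyr(1.98)    # age of the Universe at the cluster redshift
print(f"age today:       {age_now:.2f} Gyr")
print(f"age at z = 1.98: {age_cluster:.2f} Gyr")
print(f"look-back time:  {age_now - age_cluster:.2f} Gyr")
```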

The HST observations were obtained between 4 November 2017 
and 13 January 2018 and comprised one orbit in F105W and 12 orbits 
in the F140W+G141 filter and grism combination. The 12 F140W+G141 
orbits were split into three orbits at each of four orientations using 
an ABB BBA pattern for exposures in each orbit. The total exposure 
times in F105W, F140W and G141 were, respectively, 2,612 s, 5,171 s 
and 26,541 s. 

The imaging and spectroscopic observations were reduced with 
Grizli version 0.3.0 (ref. *). Raw HST data products were processed 
by applying standard image-calibration techniques with additional 
corrections applied for variable backgrounds (the HST reduction pipe- 
line calwf3 assumes a constant background not appropriate for WFC3 
infrared observations) and to mask artefacts such as satellite trail features. 
Relative and absolute astrometric registration was achieved 
by aligning to reference sources in the Sloan Digital Sky Survey. The 
final steps included flat fielding and master background subtraction 
for both the direct and grism images, and drizzling of the individual 
data frames to produce stacked images. 

Reduced data were processed with SExtractor (version 2.5.0; 
www.astromatic.net/software/sextractor) to generate photometric 
catalogues. The F140W image was processed using standard WFC3 zero 
point information with the gain parameter set to the image exposure 
time. Source detection used a pixel-based inverse variance weighting 
(pipeline IVM file in SExtractor), whereas source photometry employed 
a root mean square (pipeline RMS file in SExtractor) variation per pixel 
weight. The F105W image was processed employing the SExtractor 
two-image mode with the F140W image used as the detection image. 
Source fluxes and AB magnitudes were computed within two apertures: 
a 0.8-arcsecond circular aperture and an elliptical aperture 
based upon the Kron radius (a statistical moment computed from the 
surface brightness distribution in each object) with the Kron factor 
set to k = 0.8 to avoid excessive source blending in the central cluster 
regions. Sources with a half-light radius of <0.22 arcseconds were classified 
as stellar. In the following analysis we consider sources brighter 
than F140W = 25.5, corresponding to an image signal-to-noise ratio 
(SNR) >10. 

Spectral extraction from the G141 images employed the F140W
segmentation map produced by SExtractor (see above) to identify 
undispersed source positions. These source positions were then 
employed to construct a full field contamination model of each G141 
image. The contamination model initially assumes a spectrally flat 
continuum for all sources brighter than 25th magnitude in F140W. 
This provides a first-pass estimate of those pixels contaminated by 
spectra from more than one source. Spectral traces, represented 
by 2nd-order polynomial functions, were fitted to all of the above 
bright sources in each exposure at each orientation. Extracted spec- 
tra for these sources were then employed to compute a second-pass 
contamination model. The model was further refined for 26 bright 
objects, which were identified as contaminating the spectra of bright 
red-sequence galaxies. In these cases synthetic stellar population 
models were fitted to the contaminating spectra and these updated 
spectral models were propagated to the global contamination model. 
Employing the above procedures, and with the G141 observations 
split into four orientations, we were able to obtain a satisfactory 
contamination model for most sources even in such a densely 
packed field. 

Two-dimensional spectra were extracted separately for each G141
exposure and resulted in a maximum of 48 spectral extractions per
source. These spectra were optimally extracted and simultaneously
fitted with a suite of galaxy templates. The templates were stepped in
redshift over a coarse (Δz = 0.01) grid from z = 0.2 to 4.0 and
subsequently refitted over a fine grid (Δz = 0.0004) in redshift around
peaks in the probability distribution function.
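
The coarse-then-fine redshift search described above can be sketched as follows. This is only an illustration and not the Grizli implementation: the function name, the `chi2` callable and the `half_width` of the refinement window are assumptions introduced here, and only the single best coarse minimum is refined, whereas the text refits around all peaks of the probability distribution.

```python
import numpy as np

def coarse_to_fine_redshift(chi2, z_min=0.2, z_max=4.0,
                            dz_coarse=0.01, dz_fine=0.0004,
                            half_width=0.02):
    """Return the redshift minimising chi2(z) after a two-stage grid search.

    chi2       : callable giving the template-fit chi-squared at redshift z
                 (hypothetical interface, supplied by the caller)
    half_width : half-size of the fine grid around the coarse minimum
                 (assumed value, not taken from the text)
    """
    # Stage 1: coarse grid over the full redshift range.
    z_coarse = np.arange(z_min, z_max + dz_coarse / 2, dz_coarse)
    chi2_coarse = np.array([chi2(z) for z in z_coarse])
    z_peak = z_coarse[np.argmin(chi2_coarse)]

    # Stage 2: fine grid centred on the best coarse redshift.
    z_fine = np.arange(z_peak - half_width,
                       z_peak + half_width + dz_fine / 2, dz_fine)
    chi2_fine = np.array([chi2(z) for z in z_fine])
    return float(z_fine[np.argmin(chi2_fine)])
```

A redshift probability distribution can then be formed from the fine-grid values as exp(-chi2/2), normalized to unit area.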

We define the SNR of each spectrum as the average spectral flux
per pixel divided by the pipeline-computed noise per pixel integrated
over the wavelength interval 1.3-1.55 μm. A galaxy of brightness
F140W_Kron = 24 typically generates a spectrum of SNR = 5 with a scatter
consistent with random noise. We inspected visually all spectra
displaying a spectral SNR >2 to assess the reliability of the fitted
redshift and template model. We concluded that all spectra displaying
SNR ≥ 5 possess a visually reliable redshift measurement and consequently
we employ F140W_Kron = 24 as the galaxy brightness corresponding
to our spectroscopic completeness limit. Furthermore, we determined
that sources with visually identified emission lines possess a
reliable redshift to a limit of SNR = 3, corresponding to a brightness
F140W_Kron = 24.5, which we adopt for our spectroscopic completeness
limit for emission line sources. Extended Data Fig. 1 shows two examples
of extracted grism spectra.

Galaxy membership of the z = 1.98 cluster was defined according to
a number of criteria that we describe below. We define ‘gold’ members
as those displaying F140W_Kron ≤ 24 (24.5 for emission line sources)
and P_mem > 0.5, where P_mem is defined as the integral of the redshift
probability distribution function for each galaxy over the interval
1.96 < z < 2.00. This interval corresponds to z_cluster ± 3σ_z, where σ_z
is the observed frame velocity dispersion of a 5-keV galaxy cluster
expressed in redshift space. There are 33 galaxies in this class (of
which four are emission line sources with 24 < F140W_Kron < 24.5). We
compute the redshift of XLSSC 122 as the unweighted mean of the ‘gold’
cluster member redshifts. The redshift is z = 1.978 ± 0.010. We define
‘silver’ members as those displaying 0.1 < P_mem < 0.5, a change that
adds four new members (one of which is on the red sequence), for a total
of 37. Finally, we create an additional class to identify members of the
z = 1.93 foreground structure as those displaying P′_mem > 0.5, where
P′_mem is defined as the integral of the redshift probability distribution
function for each galaxy over the interval 1.91 < z < 1.95 with the same
brightness limits as before. There are 13 galaxies in this class. We
compute the redshift of this structure as the unweighted mean redshift of
these 13 galaxies. The resulting structure redshift is z = 1.934 ± 0.007.
This analysis therefore identifies a total of 50 galaxies that are
members of either XLSSC 122 or the z = 1.93 structure (see Extended Data
Table 1 and Fig. 1 for these members plotted on the greyscale HST/
WFC3 image).
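
The membership criteria above amount to integrating each galaxy's redshift probability distribution over a fixed interval and applying brightness and probability cuts. The sketch below illustrates this under stated assumptions: the function names and the sampled-PDF interface are hypothetical, the PDF is normalized numerically with a trapezoid rule, and the boundary case P_mem = 0.5 is assigned to ‘silver’, which the text does not specify.

```python
import numpy as np

def integrate_pdf(z_grid, pdf, z_lo, z_hi):
    """Fraction of a sampled redshift PDF lying within [z_lo, z_hi]."""
    z = np.asarray(z_grid, dtype=float)
    p = np.asarray(pdf, dtype=float)

    def trapezoid(x, y):
        # Plain trapezoid rule, written out to avoid version-specific NumPy names.
        return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

    inside = (z >= z_lo) & (z <= z_hi)
    return trapezoid(z[inside], p[inside]) / trapezoid(z, p)

def classify_member(p_mem, mag_kron, emission_line=False):
    """Return 'gold', 'silver' or None following the cuts quoted in the text."""
    mag_limit = 24.5 if emission_line else 24.0  # completeness limits
    if mag_kron > mag_limit:
        return None
    if p_mem > 0.5:
        return "gold"
    if p_mem > 0.1:  # P_mem = 0.5 exactly falls here (assumption)
        return "silver"
    return None
```

For the cluster, `integrate_pdf(z, pdf, 1.96, 2.00)` gives P_mem; the same function with the interval 1.91 to 1.95 gives P′_mem for the foreground structure.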

Figure 3 shows the colour-magnitude diagram for all galaxies identified
within the HST field. We identify a total of 30 red-sequence members
according to 1.15 < F105W − F140W < 1.65 and F140W_Kron < 24. Of
these, 19 are defined as ‘gold’ cluster members as described above. Of
the remaining 11 galaxies, one is a silver cluster member, one is located
at z ~ 2 with a relatively broad redshift probability distribution function,
five are located within the z = 1.93 structure and four are located at z > 2
yet display spectra affected by source confusion and contamination.
We restrict our subsequent red-sequence analysis to the 19 ‘gold’ cluster
members. An unweighted, linear, least-squares fit to the red-sequence
members generates the angled dotted line shown in Fig. 3. The
root-mean-square deviation in colour about this line normalized by the
photometric error is 1.72, that is, the observed scatter is 72% larger
than expected from the computed colour errors.
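
The red-sequence fit and its normalized scatter can be illustrated with a short sketch. The function name and array interface are assumptions; the fit is an unweighted linear least-squares fit as stated, and the scatter is taken here to be the root-mean-square of the colour residuals divided by the per-object colour errors, which is one reasonable reading of the normalization described above.

```python
import numpy as np

def red_sequence_fit(mag, colour, colour_err):
    """Unweighted linear fit of colour against magnitude.

    Returns (slope, intercept, normalised_rms), where normalised_rms is the
    r.m.s. of residual / colour_err; a value of 1 would mean the scatter is
    fully explained by the photometric errors.
    """
    mag = np.asarray(mag, dtype=float)
    colour = np.asarray(colour, dtype=float)
    colour_err = np.asarray(colour_err, dtype=float)

    # Degree-1 polynomial fit with equal weights for all galaxies.
    slope, intercept = np.polyfit(mag, colour, 1)
    residual = colour - (slope * mag + intercept)
    normalised_rms = float(np.sqrt(np.mean((residual / colour_err) ** 2)))
    return float(slope), float(intercept), normalised_rms
```

A returned value of 1.72 would correspond to the quoted result of observed scatter 72% larger than expected from the colour errors.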

At redshifts z < 1, the dominant populations of evolved, red galaxies
are interpreted to be the result of the prompt suppression of star
formation within galaxies accreting into the cluster environment.
The details of this process, euphemistically referred to as ‘quenching’,
remain uncertain, with likely physical scenarios including the
ram pressure stripping of gas from galaxies falling through the hot,
X-ray-emitting, intra-cluster medium. The exact mass scale at
which quenching occurs is also debatable, with uncertainty as to
whether the suppression of star formation occurs as galaxies are
accreted into less massive groups before encountering more massive
clusters. With a quenched fraction of 0.51 ± 0.14 at a look-back
time of 10.4 Gyr for XLSSC 122, it is clear that the physical processes
involved in quenching were established at an even earlier cosmic
epoch.

We employ the F105W and F140W photometry of red-sequence galaxies
to constrain the stellar age, star formation rate and stellar mass of
a set of synthetic stellar population models. The analysis presented in
this paper intentionally follows that performed by Andreon et al. and
Newman et al. of galaxies within the cluster JKCS 041 at z = 1.8. This
was done in order to allow as direct a comparison as possible between
galaxy populations in two high redshift clusters, albeit observed in
different photometric filters.

We employ a grid of simple stellar population models to generate
synthetic F105W and F140W photometry. The likelihood of the model
photometry given the data is expressed as L ∝ exp(−χ²/2), where:

$$ \chi^{2} = \sum_{i} \frac{(D_i - M_i)^{2}}{\sigma_i^{2}} $$

and D_i represents the measured apparent magnitude in the ith filter,
M_i is the apparent magnitude computed from the stellar population
model and σ_i is the uncertainty in the measured magnitude.
The stellar population models are characterized by an exponentially
declining burst of star formation where the star formation
rate SFR ∝ exp(−t/τ). The variable t denotes the time since the burst
commenced and τ is the e-folding time. Models are further characterized
by a Salpeter initial mass function and solar metallicity. Gas
lost during stellar evolution is not recycled and the effects of dust
are included by applying a Calzetti attenuation law parameterized
by the extinction parameter A_V at 5,500 Å. The stellar population
model photometry is normalized per unit stellar mass and is scaled
by a total stellar mass variable.
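
As a worked illustration of the likelihood described above, the sketch below evaluates a standard two-filter chi-squared, assumed here to be the sum over filters of ((D_i − M_i)/σ_i)², together with the exponentially declining star formation history. The function names are hypothetical and the model magnitudes are supplied by the caller; no stellar population synthesis is performed.

```python
import numpy as np

def chi_squared(observed_mag, model_mag, sigma_mag):
    """Sum over filters of ((D_i - M_i) / sigma_i)^2 (assumed standard form)."""
    d = np.asarray(observed_mag, dtype=float)
    m = np.asarray(model_mag, dtype=float)
    s = np.asarray(sigma_mag, dtype=float)
    return float(np.sum(((d - m) / s) ** 2))

def likelihood(observed_mag, model_mag, sigma_mag):
    """Unnormalised likelihood exp(-chi^2 / 2)."""
    return float(np.exp(-0.5 * chi_squared(observed_mag, model_mag, sigma_mag)))

def sfr_exponential(t, tau):
    """Relative star formation rate exp(-t / tau), equal to 1 at t = 0.

    t   : time since the burst commenced
    tau : e-folding time, in the same units as t
    """
    return np.exp(-np.asarray(t, dtype=float) / tau)
```

For example, a model that misses one of the two magnitudes by exactly one standard deviation gives a chi-squared of 1 and a relative likelihood of exp(−0.5).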

The stellar population model grid spans 8 < log[t (yr)] < 9.7 and
8 < log[τ (yr)] < 9.7. We compute posterior distributions in log t, log τ
and the logarithm of the stellar mass, employing a Markov chain Monte
Carlo algorithm and assuming flat priors. We do not explore the A_V
posterior explicitly at this stage. Instead we compute the posterior
distributions of the above variables at three explicit values of A_V
(0.0, 0.3 and 0.5). Finally, we
compute the average luminosity-weighted stellar population age, fol- 
lowing refs. 7°, as: 
The posterior distributions of the luminosity-weighted stellar age and
the stellar mass are displayed in Extended Data Fig. 2 for the 19 ‘gold’
cluster members. Extended Data Fig. 3 displays the one-dimensional
posterior distribution in age for each cluster member having
marginalized over stellar mass. In addition, Fig. 4 compares the average
age posterior for all cluster members, computed as the product of
individual posteriors, for each of the three dust models, A_V = 0.0, 0.3
and 0.5. Values of mean luminosity-weighted stellar age and standard
deviation are listed in Table 1. For the canonical model employing zero
dust absorption the mean stellar age of 2.98 Gyr at z = 1.98 corresponds
to a mean star formation redshift of z = 12.0.

The spread of stellar age values in XLSSC 122 overlaps with those 
determined for the galaxy cluster JKCS 041 at z=1.803 (ref. °), yet the 
mean stellar age in XLSSC 122 is older, even though the Universe is 
0.32 Gyr younger at z= 1.98 compared to z= 1.8. Although this com- 
parison employs the same analysis methodology, the two clusters 
are observed using different photometric filters, while the clusters 
themselves may represent very different structures. It is instructive 
therefore to further compare our results to the study of Strazzullo
et al., who also analysed HST WFC3 F105W and F140W photometry for
the z = 2 cluster CL1449+0856. Applying a stellar population model
characterized by a short (0.25 Gyr) burst of metal-rich (150% of the solar
value) star formation, they obtain a typical formation redshift of 3 to 5
for galaxies of similar colour and redshift to those analysed in XLSSC
122. Applying a similar model to the data for XLSSC 122, we obtain a
typical stellar population age of 1.4 Gyr, corresponding to a formation
redshift of 3.3, in agreement with ref. *. Ultimately, we consider
the assumption of a short, metal-rich burst of star formation to be
unnecessarily restrictive given the considerable uncertainty regarding
the exact physical state of these high-redshift stellar populations
and adopt a more flexible approach as outlined in this paper. Overall
however, the comparison is instructive because it highlights the key 
influence of the assumptions governing the stellar population model 
upon the inferred formation redshift of the luminosity weighted 
stellar content of the cluster member galaxies. The acquisition of 
further data, in particular concerning the dust and metal content of 
the member galaxies of these high-redshift clusters provides a clear 
observational route to resolving such issues. We therefore emphasize 
in conclusion that the results of such stellar population modelling, 
when based upon broad-band photometry, are most conservatively 
interpreted as indicating the range of physically reasonable input 
parameters and not as indicating a definitive physical state of the 
stellar population. 


Data availability 
Extended Data Fig. 1 | Example spectra of two member galaxies of XLSSC 122.
a, The brightest cluster galaxy (ID 526) as the black line with error bars with the
best-fitting, redshifted galaxy template shown in red (see Methods). b, A fainter
cluster member with strong emission lines (ID 1141) as the black line with error
bars. Error bars indicate the 1-sigma measurement uncertainty. The vertical
dashed lines show the observed frame location of [O II] 3,727 Å, Hβ 4,861 Å and
[O III] 5,007 Å at a redshift of 1.963.
Extended Data Fig. 2 | The luminosity-weighted stellar age versus the mass
of red sequence cluster member galaxies. Posterior distributions in mean
stellar age and log stellar mass for the 19 ‘gold’ members of the cluster red
[...] of the posterior probability for each galaxy. The horizontal dashed line
indicates an age of 3.35 Gyr, that is, the age of the Universe at a redshift
z = 1.98 in the assumed cosmological model.
Extended Data Fig. 3 | The luminosity-weighted stellar age distributions for
red-sequence cluster member galaxies. Panels show posterior distributions
in luminosity-weighted age for each ‘gold’ member galaxy of XLSSC 122, having
marginalized over stellar mass. For convenience, the same colour scheme is
employed as in Extended Data Fig. 2. In each panel the solid, dashed and
dotted curves display, respectively, SED models characterized by A_V = 0.0,
0.3 and 0.5. The vertical dashed line in each panel indicates the age of the
Universe at z = 1.98.
Extended Data Table 1 | Measured properties of confirmed members of XLSSC 122 and the z = 1.93 structure

ID  Right ascension (deg.)  Declination (deg.)  Magnitude  Colour  Redshift  Notes
526 34.43422 -3.75880 20.64 1.44 1.980 G 
451 34.42228 -3.76351 21.95 1.29 1.981 G 
657 34.43410 -3.75766 21.67 1.49 1.983 G 
1032 34.43245 -3.74992 22.38 1.33 1.982 G 
295 34.43503 -3.76795 22.50 1.56 1.987 G 
917 34.43563 -3.75314 22.73 0.43 1.963 G 
298 34.44715 -3.76801 22.52 1.42 1.993 G 
1050 34.43689 -3.75017 22.85 1.37 1.977 G 
1064 34.43592 -3.74954 22.34 1.35 1.988 G 
606 34.43845 -3.76070 22.99 1.28 1.966 G 
240 34.42242 -3.77000 22.62 1.20 1.977 G 
845 34.43470 -3.75489 23.38 1.38 1.979 G 
372 34.44410 -3.76567 23.08 0.66 1.963 G 
734 34.42501 -3.75803 23.39 1.43 1.996 G 
1220 34.44335 -3.74500 23.49 1.31 1.976 G 
345 34.44185 -3.76667 23.56 1.27 1.991 G 
145 34.44478 -3.77286 22.90 0.48 1.981 G 
493 34.43300 -3.76318 23.58 1.38 1.962 G 
603 34.43939 -3.76030 23.38 0.46 1.979 G 
1141 34.43362 -3.74775 23.85 0.47 1.963 G 
402 34.44641 -3.76532 23.05 0.66 1.972 G 
730 34.43975 -3.75826 23.97 1.61 1.993 G 
649 34.43396 -3.75927 22.44 1.54 1.997 G 
726 34.43060 -3.75762 23.46 1.28 1.969 G 
452 34.41895 -3.76387 23.69 0.45 1.971 G 
806 34.44771 -3.75609 23.78 1.02 1.981 G 
236 34.45158 -3.77029 23.02 N/A 1.977 G 
547 34.43527 -3.76248 23.96 0.28 1.963 G 
428 34.44661 -3.76447 23.89 1.38 1.977 G 
466 34.41865 -3.76372 24.11 0.49 1.979 GE 
229 34.44051 -3.77036 24.01 0.46 1.978 GE 
329 34.42761 -3.76741 24.38 0.30 1.972 GE 
263 34.42106 -3.76924 24.25 0.50 1.976 GE 
642 34.43380 -3.75881 22.40 1.30 2.041 S 
1253 34.44633 -3.74360 23.72 0.39 2.018 S 
1125 34.43874 -3.74822 24.41 0.62 2.000 SE 
522 34.41896 -3.76281 24.39 0.81 1.959 SE 
462 34.41950 -3.76258 21.41 1.34 1.930 F 
662 34.42184 -3.75893 22.09 0.77 1.943 F 
574 34.42053 -3.76141 22.46 1.43 1.931 F 
514 34.42648 -3.76251 22.69 1.32 1.935 F 
483 34.41598 -3.76315 22.12 1.38 1.930 F 
631 34.41783 -3.76003 22.79 0.45 1.933 F 
607 34.44899 -3.76058 23.36 0.89 1.941 F 
598 34.41724 -3.76130 23.59 1.42 1.939 F 
623 34.45558 -3.76030 22.96 N/A 1.948 F 
5 34.43687 -3.78074 22.74 N/A 1.931 F 
954 34.45186 -3.75109 23.57 0.32 1.927 F 
181 34.42259 -3.77175 23.65 0.33 1.936 F 
750 34.45794 -3.75725 22.29 N/A 1.923 F 


Magnitudes are measured using the F140W filter and employ a Kron-type aperture. Colours are expressed as F105W − F140W magnitudes and are measured in 0.8-arcsecond circular apertures.
The notes refer to gold (G) and silver (S) cluster members in addition to galaxies located in the foreground (F) structure; E refers to an emission line galaxy. Decl., declination. ID numbers are
output from SExtractor.
Moiré lattices consist of two superimposed identical periodic structures with a
relative rotation angle. Moiré lattices have several applications in everyday life,
including artistic design, the textile industry, architecture, image processing,
metrology and interferometry. For scientific studies, they have been produced using
coupled graphene-hexagonal boron nitride monolayers, graphene-graphene
layers and graphene quasicrystals on a silicon carbide surface. The recent surge of
interest in moiré lattices arises from the possibility of exploring many salient physical
phenomena in such systems; examples include commensurable-incommensurable
transitions and topological defects, the emergence of insulating states owing to band
flattening, unconventional superconductivity controlled by the rotation angle,
the quantum Hall effect, the realization of non-Abelian gauge potentials and the
appearance of quasicrystals at special rotation angles. A fundamental question that
remains unexplored concerns the evolution of waves in the potentials defined by
moiré lattices. Here we experimentally create two-dimensional photonic moiré
lattices, which, unlike their material counterparts, have readily controllable
parameters and symmetry, allowing us to explore transitions between structures with
fundamentally different geometries (periodic, general aperiodic and quasicrystal).
We observe localization of light in deterministic linear lattices that is based on
flat-band physics, in contrast to previous schemes based on light diffusion in optical
quasicrystals, where disorder is required for the onset of Anderson localization
(that is, wave localization in random media). Using commensurable and
incommensurable moiré patterns, we experimentally demonstrate the
two-dimensional localization-delocalization transition of light. Moiré lattices may feature
an almost arbitrary geometry that is consistent with the crystallographic symmetry
groups of the sublattices, and therefore afford a powerful tool for controlling the
properties of light patterns and exploring the physics of periodic-aperiodic phase
transitions and two-dimensional wavepacket phenomena relevant to several areas of
science, including optics, acoustics, condensed matter and atomic physics.


One of the most salient properties of an engineered optical system
is its capability to affect a light beam in a prescribed manner, such as
to control its diffraction pattern or to localize it. The importance of
wavepacket localization extends far beyond optics and impacts all
branches of science dealing with wave phenomena. Homogeneous or
strictly periodic linear systems cannot result in wave localization, and
the latter requires the presence of structure defects or nonlinearity.
Anderson localization is a hallmark discovery in condensed-matter
physics. All electronic states in one- and two-dimensional potentials
with uncorrelated disorder are localized. Three-dimensional systems
with disordered potentials are known to have both localized and
delocalized eigenstates, separated by an energy known as the mobility
edge. Coexistence of localized and delocalized eigenstates has been
predicted also in regular quasiperiodic one-dimensional systems, first
in the discrete Aubry-André model and later in continuous optical and
matter-wave systems. Quasiperiodic (or aperiodic) structures, even
those that possess long-range order, fundamentally differ both from
periodic systems, where all eigenmodes are delocalized Bloch waves,
and from disordered media, where all states are localized (in one or
two dimensions). Upon variation of the parameters of a quasiperiodic
system, it is possible to observe the transition between localized and
delocalized states. Such a localization-delocalization transition (LDT)
has been observed in one-dimensional quasiperiodic optical and in
atomic systems.

Fig. 1 | Moiré lattices, density of states and band structures. a–c, Moiré
lattices with lattice intensity I(r), generated by two interfering square
sublattices with p_1 = p_2 and axes mutually rotated by the angle indicated in each
panel. Top row, calculated patterns. Middle row, schematic discrete
representation of two rotated sublattices. Bottom row, experimental patterns
at the output face of the crystal. The scale is the same for all images.

Wave localization is sensitive to the dimensionality of the physical
setting. Anderson localization and a mobility edge in two-dimensional
systems were first reported in experiments with bending waves and
later in optically induced disordered lattices. In quasicrystals,
localization has been observed only under the action of nonlinearity and
in the presence of strong disorder. Although localization and delocalization
of light in two-dimensional systems without any type of disorder and
nonlinearity have been predicted theoretically for moiré lattices
and very recently for Vogel spirals, the phenomenon has never been
observed experimentally.

Here we report the first, to our knowledge, experimental realization
of reconfigurable photonic moiré lattices with controllable parameters
and symmetry. The lattices are induced by two superimposed periodic
patterns (sublattices) with either square or hexagonal primitive cells,
and have tunable amplitudes and twist angle. Depending on the twist
angle, a photonic moiré lattice may have different periodic
(commensurable) structure or aperiodic (incommensurable) structure without
translational periodicity, but it always features the rotational symmetry
of the sublattices. Moiré lattices can also transform into quasicrystals
with higher rotational symmetry. The angles at which a commensurable
phase (periodicity) of a moiré lattice is achieved are determined by
Pythagorean triples in the case of square sublattices, or by another
Diophantine equation when the primitive cell of the sublattices is not
a square (see Methods). For all other rotation angles, the structure is
aperiodic albeit regular (that is, it is not disordered). Changing the
relative amplitudes of the sublattices allows us to smoothly tune the
shape of the lattice without affecting its rotational symmetry.

Fig. 1 (continued) | d,e, Comparison of the density of states calculated for a moiré lattice (top) and
its periodic approximation (bottom) at p_2 = 0.1 (d) and p_2 = 0.2 (e). The
approximate Pythagorean lattice has period b, = (3361 1 (see Supplementary
Information). f, Band structures for a periodic lattice approximating a moiré
lattice at p_2 = 0.1 (top; 15 upper bands are shown) and p_2 = 0.2 (bottom; 68 upper
bands are shown). In all cases p_1 = 1.

In contrast to crystalline moiré lattices, optical patterns are monolayer
structures; that is, both sublattices interfere in one plane. As a
consequence, light propagating in such media is described by a
one-component field. In the paraxial approximation, the propagation of
an extraordinarily polarized beam in a photorefractive medium with
an optically induced refractive index is governed by the Schrödinger
equation for the dimensionless field amplitude ψ(r, z):

$$ i\frac{\partial \psi}{\partial z} = -\frac{1}{2}\nabla^{2}\psi + \frac{E_0}{1 + I(\mathbf{r})}\,\psi \qquad (1) $$

Here ∇ = (∂/∂x, ∂/∂y); r = (x, y) is the radius vector in the transverse
plane, scaled to the wavelength λ = 632.8 nm of the beam used in the
experiments; z is the propagation distance, scaled to the diffraction
length 2πn_eλ; n_e is the refractive index of the homogeneous crystal for
extraordinarily polarized light; E_0 > 0 is the dimensionless applied d.c.
field; I(r) = |p_1V(r) + p_2V(Sr)|² is the intensity of the moiré lattice induced
by two ordinarily polarized mutually coherent periodic sublattices,
V(r) and V(Sr), interfering in the (x, y) plane and rotated by angle θ with
respect to each other (see Methods); S = S(θ) is the operator of the
two-dimensional rotation; and p_1 and p_2 are the amplitudes of the first and
second sublattices, respectively. The number of laser beams creating
each sublattice V(r) depends on the desired lattice geometry. The form
in which the lattice intensity I(r) enters equation (1) is determined by
the mechanism of the photorefractive response.
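
The lattice intensity I(r) = |p_1V(r) + p_2V(Sr)|² can be evaluated numerically as in the sketch below. The square sublattice profile V(r) = cos(2x) + cos(2y) is an assumption chosen for illustration (the actual profile is defined in the Methods, which are not reproduced here), as are the function names and the sense of rotation.

```python
import numpy as np

def square_sublattice(x, y):
    """Illustrative square sublattice V(r); the functional form is assumed."""
    return np.cos(2.0 * x) + np.cos(2.0 * y)

def rotate(x, y, theta):
    """Apply the two-dimensional rotation S(theta) to the coordinates (x, y)."""
    c, s = np.cos(theta), np.sin(theta)
    return c * x - s * y, s * x + c * y

def moire_intensity(x, y, theta, p1=1.0, p2=1.0):
    """I(r) = |p1 V(r) + p2 V(S r)|^2 for two mutually rotated sublattices."""
    xr, yr = rotate(x, y, theta)
    field = p1 * square_sublattice(x, y) + p2 * square_sublattice(xr, yr)
    return np.abs(field) ** 2

# Example: sample the lattice on a grid for the Pythagorean angle arctan(3/4).
x = np.linspace(-20.0, 20.0, 201)
X, Y = np.meshgrid(x, x)
I_periodic = moire_intensity(X, Y, np.arctan(3.0 / 4.0))
```

With this choice of V(r) both sublattices peak at the origin, so I(0) = 4(p_1 + p_2)² is the global maximum for any rotation angle.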

To visualize the formation of moiré lattices, it is convenient to
associate a continuous sublattice V(r) with a discrete one that has lattice
vectors determined by the locations of the absolute maxima of V(r).
The resulting moiré pattern inherits the rotational symmetry of V(r).
At specific angles some nodes of different sublattices may coincide,
thereby leading to translational symmetry of the moiré pattern in the
commensurable phase; see the primitive translation vectors illustrated
by blue arrows in Fig. 1a, c for the case of square sublattices. The rotation
angles at which the periodicity of I(r) is achieved are determined by
triples of positive integers (a, b, c) ∈ ℕ related by a Diophantine
equation characteristic for a given sublattice (see Extended Data Table 1).

Fig. 2 | Form factor and moiré states. a, Form factor (inverse width) of the
eigenmodes with largest β versus rotation angle θ and versus the amplitude of
the second sublattice, p_2, at p_1 = 1. The horizontal dashed line indicates the
sublattice depth p_2^LDT at which LDT occurs. The vertical dashed line shows one
of the Pythagorean angles θ = arctan(3/4). b,c, Examples of mode profiles with
the largest β for p_2 < p_2^LDT (b) and p_2 > p_2^LDT (c). The insets show cuts of the ln|ψ|²
distribution along the x and y axes.

First, we consider a Pythagorean lattice created by two square
sublattices. For rotation angles θ such that cos θ = a/c and sin θ = b/c, where
(a, b, c) is a Pythagorean triple (that is, a² + b² = c²), I(r) is a periodic moiré
lattice. Such angles are hereafter referred to as Pythagorean. For all
other, non-Pythagorean, rotation angles θ, the lattice I(r) is aperiodic.
Figure 1a–c compares calculated I(r) patterns (first row) with lattices
created experimentally in a biased SBN:61 photorefractive crystal with
dimensions 5 × 5 × 20 mm³ (third row) for different rotation angles. The
second row shows the respective discrete moiré lattices. Figure 1a, c
shows periodic lattices, whereas Fig. 1b gives an example of an aperiodic
lattice. All results were obtained for E_0 = 7, which corresponds to a d.c.
electric field of 8 × 10⁴ V m⁻¹ applied to the crystal. The amplitude of the
first sublattice was set to p_1 = 1 in all cases, which corresponds to an
average intensity of I_av ≈ 3.8 mW cm⁻². For such parameters, the refractive
index modulation depth in the moiré lattice is of the order of δn ~ 10⁻⁴.
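
The Pythagorean angles defined above are straightforward to enumerate. The following sketch lists primitive triples with a > b up to a chosen hypotenuse and the corresponding angle θ satisfying cos θ = a/c and sin θ = b/c; the function name and the restriction to primitive triples with a > b are choices made here for brevity.

```python
import math

def pythagorean_angles(c_max):
    """Return (a, b, c, theta) for primitive triples a^2 + b^2 = c^2, c <= c_max.

    theta satisfies cos(theta) = a / c and sin(theta) = b / c. Only a > b is
    listed, so every angle lies between 0 and 45 degrees.
    """
    result = []
    for a in range(1, c_max):
        for b in range(1, a):
            c_squared = a * a + b * b
            c = math.isqrt(c_squared)
            # Keep exact integer hypotenuses and skip multiples of smaller triples.
            if c * c == c_squared and c <= c_max and math.gcd(a, b) == 1:
                result.append((a, b, c, math.atan2(b, a)))
    return sorted(result, key=lambda triple: triple[2])

for a, b, c, theta in pythagorean_angles(30):
    print(f"({a}, {b}, {c}): theta = {math.degrees(theta):.3f} deg")
```

The first entry, (4, 3, 5), reproduces the angle θ = arctan(3/4) used for the commensurable lattice in the experiments.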

Mathematically, incommensurable lattices are almost periodic
functions. Any non-Pythagorean twist angle can be approached by
a Pythagorean one with any prescribed accuracy (see Supplementary
Information). Thus, any finite area of an incommensurable moiré lattice
can be approached by a primitive effective cell of some periodic
Pythagorean lattice, whereas a more accurate approximation requires
a larger primitive cell of the Pythagorean lattice. This property is
illustrated in Fig. 1d, e by the quantitative similarities between the densities
of states calculated for an incommensurable lattice and its effective-cell
approximation. A remarkable property of Pythagorean lattices is the
extreme flattening of the higher bands that occurs when the ratio p_2/p_1
exceeds a certain threshold (Fig. 1f). The number of flat bands grows
with the size of the area of the primitive cell of the Pythagorean lattice
approximation. Thus, an incommensurable moiré lattice can be viewed
as the large-area limit of a periodic Pythagorean lattice with extremely
flat higher bands. We note that the existence of flat bands for twisted
bilayer graphene has been discussed previously. Because flat bands
support quasi-nondiffracting localized modes, an initially localized beam
launched into such a moiré lattice will remain localized. This flat-band
physics of moiré lattices, which is fundamentally different from that
of Anderson localization in random media, allows us to predict light
localization above a threshold value of the ratio p_2/p_1. Furthermore,
flat bands support states that are exponentially localized in the
primitive cell and that can be well approximated by exponentially localized
two-dimensional Wannier functions (see Fig. 2c and Supplementary
Information).

Fig. 3 | Output patterns of light propagating through moiré lattices. a–c,
Observed output intensity distributions, illustrating LDT with increasing
amplitude p_2 of the second sublattice for rotation angle θ = arctan(3^(-1/2)) = π/6 (a,
c) and absence of LDT for the Pythagorean angle θ = arctan(3/4) (b). The insets
show the location of the excitation: central (a, b) and off-centre (c).

Fig. 4 | Moiré lattices created by superposition of two rotated hexagonal
lattices. a–d, Top row, moiré lattices produced by the interference of two
hexagonal patterns rotated by angle θ with p_2 = 1 (a, b, d) and p_2 = 0.18 (c). Middle
row, schematic discrete representation of two rotated hexagonal sublattices.
Bottom row, measured output-intensity distributions for the signal beam at the
output face of the crystal. In all cases p_1 = 1. Blue arrows in the middle panels of a
and b indicate the primitive translation vectors of the corresponding discrete
lattice.

To elucidate the impact of the sublattice amplitudes and rotation
angle θ on the light localization, we calculated the linear eigenmodes
ψ(r, z) = w(r)e^(iβz) (where β is the propagation constant and w(r) is the
mode profile) supported by the moiré lattices. To characterize their
localization we use the integral form factor χ = (U⁻² ∬|ψ|⁴ d²r)^(1/2), where
U = ∬|ψ|² d²r is the energy flow (the integration is over the transverse
area of the crystal). The form factor is inversely proportional to the
mode width: the larger the value of χ, the stronger the localization. The
dependence of the form factor of the most localized mode (the mode
with largest β) on θ and p_2 is shown in Fig. 2a (for modes with lower
values of β, the dependencies are qualitatively similar). One observes
a sharp LDT above a certain threshold depth p_2^LDT of the second
sublattice, at the amplitude of the first sublattice, p_1 = 1. Below p_2^LDT all modes
are extended (Fig. 2b), and above the threshold some modes are
localized (Fig. 2c). This is consistent with the extreme band flattening
of the approximate Pythagorean lattice at p_2 > p_2^LDT (Fig. 1f). The inset
in Fig. 2c reveals exponential tails for p_2 > p_2^LDT, from which the
localization length for the most localized mode can be extracted.
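
The integral form factor is simple to evaluate for a field sampled on a uniform grid. The sketch below assumes the definition χ = (U⁻² ∬|ψ|⁴ d²r)^(1/2) with U = ∬|ψ|² d²r and approximates the integrals by Riemann sums; the function name and arguments are hypothetical.

```python
import numpy as np

def form_factor(psi, dx, dy):
    """Integral form factor of a sampled field psi on a grid with spacing dx, dy.

    Larger values indicate a narrower (more strongly localized) field.
    """
    intensity = np.abs(np.asarray(psi)) ** 2
    energy_flow = intensity.sum() * dx * dy             # U = integral of |psi|^2
    fourth_moment = (intensity ** 2).sum() * dx * dy    # integral of |psi|^4
    return float(np.sqrt(fourth_moment) / energy_flow)

# Example: a Gaussian of width w has chi = 1 / (w * sqrt(2 * pi)).
x = np.linspace(-20.0, 20.0, 401)
X, Y = np.meshgrid(x, x)
step = x[1] - x[0]
gaussian = np.exp(-(X ** 2 + Y ** 2) / 2.0)  # width w = 1
chi = form_factor(gaussian, step, step)
```

For the Gaussian example, doubling the width halves χ, consistent with the statement that the form factor is inversely proportional to the mode width.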

Figure 2a shows delocalization for angles θ set by the Pythagorean
triples when all modes are extended, regardless of the value of p_2. It
also reveals that p_2^LDT is practically independent of the non-Pythagorean
rotation angle. This is explained by the fact that a large fraction of the
power in a localized mode resides in the vicinity of a lattice maximum
(that is, at r = 0). In an incommensurable phase, when I(r) < I(0) for all
r ≠ 0, the optical potential can be approximated by the Taylor expansion
of E_0/[1 + I(r)] with respect to r near the origin. Such an expansion includes
the rotation angle θ only in the fourth order (see Supplementary
Information) and therefore locally can be viewed as almost isotropic.

To study the guiding properties of the Pythagorean moiré lattices
experimentally, we measured the diffraction outputs for beams
propagating in lattices corresponding to different rotation angles θ for a
fixed input position of the beam, centred or off-centre. The diameter
of the Gaussian beam focused on the input face of the crystal was about
23 μm, covering approximately one bright spot (channel) of the lattice
profile. The intensity of the input beam was about 10 times lower than
the intensity of the lattice-creating beam, I_av, to guarantee that the beam
did not distort the induced refractive index and that it propagated in
the linear regime.

Experimental evidence of LDT in the two-dimensional lattice is presented
in Fig. 3, where we compare output patterns for the low-power
light beam in the incommensurable (tan θ = 3^(−1/2); Fig. 3a and Fig. 3c for
central and off-centre excitations, respectively) and commensurable
(tan θ = 3/4; Fig. 3b) moiré lattices, tuning in parallel the amplitude p₂
of the second sublattice. When p₂ < p₂^LDT (in Fig. 3, p₂^LDT ≈ 0.15), the light
beam in the incommensurable lattice notably diffracts upon propagation
and expands across multiple local maxima of I(r) in the vicinity of
the excitation point. However, when p₂ exceeds the LDT threshold, it
is readily visible that diffraction is arrested for both central (Fig. 3a)
and off-centre (Fig. 3c) excitations and a localized spot is observed at
the output over a large range of p₂ values. In clear contrast, localization
is absent for any p₂ value in the periodic lattice associated with the
Pythagorean triple (Fig. 3b). Additional proof of the LDT is reported in
Extended Data Fig. 1. We compare experimental and theoretical results
for propagation at p₁ = 1. In an incommensurable lattice, at p₂ < p₂^LDT
one observes beam broadening (top row). Localization takes place at
p₂ > p₂^LDT (middle row). At a Pythagorean twist angle, localization does
not occur even for p₁ = p₂ = 1 (bottom row). Simulations of the propagation
to much larger distances beyond the available sample length
(Extended Data Fig. 2) confirm localization of the beam in the
incommensurable lattice at any distance at p₂ > p₂^LDT and its expansion at
p₂ < p₂^LDT.


The mutual rotation of two identical sublattices allows the generation
of commensurable and incommensurable moiré patterns with
sublattices of any allowed symmetry. To illustrate the universality of
LDT, we created hexagonal moiré lattices using an induction technique
similar to that used for single honeycomb photonic lattices®. For such
lattices, the rotation angles producing commensurable patterns are
given by the relation tan θ = b√3/(2a + b), where the integers a, b and
c solve the Diophantine equation a² + b² + ab = c². Two examples are
presented in Fig. 4a, b. In such periodic structures, the light beam
experiences considerable diffraction for any amplitude of the sublattices,
as shown in the bottom row. To observe LDT, one has to induce aperiodic
structures. To this end, we set the rotation angle to 30°. In this
incommensurable case, we did observe LDT by increasing the amplitude of
the second sublattice, keeping the amplitude p₁ fixed. Delocalized and
localized output beams are shown in the lower panels of Fig. 4c, d. In
Fig. 4c the ideal six-fold rotation symmetry of the output pattern is
slightly distorted, presumably owing to the intrinsic anisotropy of the
photorefractive response. At p₂ = p₁ the moiré pattern acquires a 12-fold
rotational symmetry (shown in Fig. 4d), as proposed in ref. " as a model
of a quasicrystal, and similar to the twisted bilayer graphene
quasicrystal reported in ref. °.
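
The commensurability condition for hexagonal sublattices lends itself to a quick numerical check. The following is a minimal sketch, not the authors' procedure: it enumerates coprime integer pairs (a, b) for which a² + b² + ab is a perfect square and evaluates the twist angle from tan θ = b√3/(2a + b). The function name and the search bound `max_side` are illustrative assumptions.

```python
import math


def hexagonal_commensurable_angles(max_side=20):
    """List (a, b, c, theta) with a^2 + b^2 + a*b = c^2 and tan(theta) = b*sqrt(3)/(2a + b).

    Only coprime (a, b) are kept, because a common factor leaves the angle unchanged.
    `max_side` is an arbitrary search bound chosen for illustration.
    """
    results = []
    for a in range(1, max_side + 1):
        for b in range(1, max_side + 1):
            if math.gcd(a, b) != 1:
                continue
            c_squared = a * a + b * b + a * b
            c = math.isqrt(c_squared)
            if c * c != c_squared:
                continue  # not an integer solution of the Diophantine equation
            theta = math.atan2(b * math.sqrt(3), 2 * a + b)
            results.append((a, b, c, theta))
    return results


for a, b, c, theta in hexagonal_commensurable_angles():
    print(f"a={a:2d} b={b:2d} c={c:2d} cos(theta)={math.cos(theta):.6f}")
```

Within this search range the triples (3, 5, 7) and (5, 16, 19) give cos θ = 11/14 and cos θ = 13/19, the values that appear among the Fig. 4 panel labels. A 30° twist would require a = b, for which a² + b² + ab = 3a² is never a perfect square, in line with that angle being incommensurable.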

Moiré lattices can be created in practically any arbitrary configura- 
tion consistent with two-dimensional symmetry groups, thus allowing 
the creation of potentials that may not be easily produced in tunable 
form using material structures. In addition to their direct application 
to the control of light patterns, the availability of photonic moiré pat- 
terns allows the study of phenomena relevant to other areas of physics, 
particularly to condensed matter, which are harder to explore directly. 
An outstanding example is the relation between conductivity/transport
and the symmetry of incommensurable patterns with long-range order.
The concept can also be extended to atomic physics and in particular to
Bose-Einstein condensates, where potentials are created using similar 
geometries (and where Anderson localization has been observed”). 
Finally, we note that whereas most previous studies of moiré lattices 
were focused on graphene and quasicrystals, our results suggest that 
the photonic counterpart affords a powerful platform for the creation 
of synthetic settings to investigate wavepacket localization and flat- 
band phenomena in two-dimensional systems at large. 
Methods


Experimental setup 

The experimental setup is illustrated in Extended Data Fig. 3. The lattice
is created using optical induction, as described in ref. *? and first
realized experimentally in ref.**. A continuous-wave frequency-doubled
Nd:YAG laser at a wavelength of λ = 532 nm is divided by a polarizing
beam splitter into two polarization components, which are sent to path
a and path b separately. Light in path a is extraordinarily polarized and
is used to image the induced potential in the photorefractive crystal
(bottom row of Fig. 1). Light in path b is ordinarily polarized and is used
to write the desired potential landscape in a photorefractive SBN:61
crystal with dimensions 5 × 5 × 20 mm³ and extraordinary refractive
index nₑ = 2.2817. Before entering the crystal, the ordinarily polarized
light beam in path b is modulated by masks 1 and 2 and is transformed
into a superposition of two rotated periodic patterns. Their relative
strength p₂/p₁ (more precisely, the strength of the second lattice) and
the twist angle θ are controlled by the polarizer-based mask 1
and the amplitude mask 2. A He-Ne laser with wavelength λ = 633 nm,
shown in path c, provides an extraordinarily polarized beam focused
onto the front facet of the crystal, which serves as a probe beam for
studying light propagation in the induced potential. We record the
output light intensity pattern using a charge-coupled device (CCD)
at the exit facet of the crystal after a propagation distance of 20 mm.


Characteristics of moiré lattices used in experiment 
Two types of moiré lattices were used in the experiments, and their
characteristics are summarized in Extended Data Table 1. In all cases
the centre of rotation in the (x, y) plane was chosen to be coincident
with a node of one of the sublattices.
Extended Data Fig. 1 | Evolution of light in moiré lattices. Experimentally
observed intensity distributions of the probe beam (colour-surface plots) and
corresponding theoretically calculated distributions (insets) at different
propagation distances z, for tan θ = 3^(−1/2) and p₂ = 0.1 (below the LDT point; top
row), tan θ = 3^(−1/2) and p₂ = 1 (above the LDT point; middle row) and tan θ = 3/4 and
p₂ = 1 (bottom row). The two top rows correspond to the incommensurable
Pythagorean lattice shown in Fig. 1b. The third row corresponds to the
commensurable lattice shown in Fig. 1c.
[Panel label in each row: 20 mm.]
Extended Data Fig. 2 | Numerical simulation of light propagation to
distances beyond the crystal length. a, b, Numerical simulations of the
light-beam propagation in the incommensurable moiré lattice for central excitation,
corresponding to the top and middle rows of Extended Data Fig. 1, but for larger
distances, exceeding the sample length. c, d, Similar numerical results, but for
an off-centre excitation position in the moiré lattice. p₂ = 0.1 (a, c), p₂ = 1.0 (b, d),
θ = π/6. In all cases, a Gaussian beam exciting a single site of the potential is
assumed.
Extended Data Fig. 3 | Experimental setup. λ/2, half-wave plate; PBS,
polarizing beam splitter; SF, spatial filter; L, lens; BS, beam splitter; ID, iris
diaphragm; M, mirror; P, polarizer; SBN, strontium barium niobate crystal;
CCD, charge-coupled device. Mask 2 is an amplitude mask used to produce
two groups of sublattices with rotation angle θ, and mask 1 is made of a
polarizer film.


Extended Data Table 1 | Characteristics of the moiré lattices used in the experiments

Moiré lattice I(r) | Sublattice V(r) | Diophantine equation | tan θ
Pythagorean | cos(2x) + cos(2y) | a² + b² = c² | b/a
Hexagonal | Σ_{m=1}^{3} cos[2(x cos θ_m + y sin θ_m)] | a² + b² + ab = c² | √3 b/(2a + b)

For hexagonal lattices θ₁ = 0, θ₂ = 2π/3 and θ₃ = 4π/3.
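
As an illustration of how the table entries fit together, the sketch below evaluates the two sublattice profiles V(r) and superposes a sublattice with a rotated copy of itself. The intensity formula I(r) = |p₁V(r) + p₂V(Sr)|², with S a rotation by θ, is an assumption made here for illustration (the table only names I(r)); the function names and the sampling points are likewise hypothetical.

```python
import numpy as np

# Orientations of the three cosine terms of the hexagonal sublattice (table footnote).
HEX_ANGLES = (0.0, 2 * np.pi / 3, 4 * np.pi / 3)


def v_square(x, y):
    """Square ('Pythagorean') sublattice profile: cos(2x) + cos(2y)."""
    return np.cos(2 * x) + np.cos(2 * y)


def v_hexagonal(x, y):
    """Hexagonal sublattice profile: sum over m of cos[2(x cos theta_m + y sin theta_m)]."""
    return sum(np.cos(2 * (x * np.cos(t) + y * np.sin(t))) for t in HEX_ANGLES)


def rotate(x, y, theta):
    """Rotate the coordinates (x, y) by the angle theta."""
    c, s = np.cos(theta), np.sin(theta)
    return c * x - s * y, s * x + c * y


def moire_intensity(x, y, theta, p1=1.0, p2=1.0, sublattice=v_square):
    """ASSUMED form I = |p1 V(r) + p2 V(S r)|^2, with S a rotation by theta."""
    xr, yr = rotate(x, y, theta)
    return np.abs(p1 * sublattice(x, y) + p2 * sublattice(xr, yr)) ** 2


# Pythagorean triple (3, 4, 5): tan(theta) = 3/4, so cos(theta) = 4/5 and sin(theta) = 3/5.
theta = np.arctan2(3, 4)
x = np.linspace(-3.0, 3.0, 7)
y = np.zeros_like(x)
shifted = moire_intensity(x + 5 * np.pi, y, theta)
print(np.allclose(shifted, moire_intensity(x, y, theta)))
```

Under these assumptions a shift of 5π along x maps both the unrotated and the rotated square sublattice onto themselves, which is the sense in which the tan θ = 3/4 pattern is periodic.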
Phase separation is a cooperative process, the kinetics of which underpin the orderly
morphogenesis of domain patterns on mesoscopic scales’”. Systems of highly 
degenerate frozen states may exhibit the rare and counterintuitive inverse-symmetry- 
breaking phenomenon’. Proposed a century ago’, inverse transitions have been found 
experimentally in disparate materials, ranging from polymeric and colloidal 
compounds to high-transition-temperature superconductors, proteins, ultrathin 
magnetic films, liquid crystals and metallic alloys*, with the notable exception of 
ferroelectric oxides, despite extensive theoretical and experimental work on the latter. 
Here we show that following a subcritical quench, the non-equilibrium self-assembly of 
ferroelectric domains in ultrathin films of Pb(Zr0.4Ti0.6)O3 results in a maze, or
labyrinthine pattern, featuring meandering stripe domains. Furthermore, upon 
increasing the temperature, this highly degenerate labyrinthine phase undergoes an 
inverse transition whereby it transforms into the less-symmetric parallel-stripe domain 
structure, before the onset of paraelectricity at higher temperatures. We find that this 
phase sequence can be ascribed to an enhanced entropic contribution of domain walls, 
and that domain straightening and coarsening is predominantly driven by the 
relaxation and diffusion of topological defects. Computational modelling and 
experimental observation of the inverse dipolar transition in BiFeO3 suggest the
universality of the phenomenon in ferroelectric oxides. The multitude of self-patterned 
states and the various topological defects that they embody may be used beyond 
current domain and domain-wall-based’ technologies by enabling fundamentally new 
design principles and topologically enhanced functionalities within ferroelectric films. 


To investigate polarization self-patterning, we use an ab initio-based
effective Hamiltonian approach® and examine ultrathin films of
Pb(Zr0.4Ti0.6)O3 (see Methods), as these widely used quasi-two-dimensional
ferroelectric systems are known to exhibit various modulated
phases depending on the interplay between strain and the amount of
screening of surface charges* ”.

It is worth noting that two underlying nested symmetry-breaking
processes are at play in these systems and involve two distinct
dynamical length scales. Whereas compressive strain introduces crystalline
anisotropy and favours dipoles with orientation perpendicular to the
film plane’’ (cubic symmetry is reduced to a quasi-Z₂ symmetry), the
depolarizing field arising from incomplete screening of surface charges
essentially imposes zero net polarization, and instead favours the
formation of multiple mesoscopic domains as a result of the spontaneous
breaking of the residual discrete symmetry. These domains of opposite
polarization alternate along in-plane directions, and each consists of
ferroelectrically ordered ensembles of dipoles. More precisely, while
an individual dipole retains the freedom to flip between the [001] and
[00-1] out-of-plane directions, an individual domain, as an emergent
mesoscopic degree of freedom, has the propensity to align along either
the [100] (horizontal) or the [010] (vertical) in-plane tetragonal
directions, owing to the underlying square lattice geometry’. Naturally, the
dynamics pertaining to the motion and relaxation of domains is slower
than that of individual dipole fluctuations, and this very fact poses
important questions as to what extent domain dynamics and their
morphology will be kinetically constrained.

One manifestation of this kinetic constraint resides in the possibil- 
ity of obtaining two distinct modulated phases at low temperatures 
depending on the cooling rate. While the well known parallel-stripe 
domain pattern (Fig. 1a) emerges as the ground state upon adiabatically 
cooling (annealing) the system®”, the labyrinthine domain polarization 
pattern (Fig. 1b) onsets upon abruptly cooling (subcritical quenching) 
the system. The latter pattern consists of convoluted stripes and mean- 
dering domains and has a very close internal energy that is only 0.6% 
higher than that of the ground state. Interestingly, inquiring into the 
stability of the labyrinthine state at T > 0 K, we find that the eigenvalues
of the Hessian matrix of the Hamiltonian are closely distributed around 
zero, with 75% of them being negative, indicating that the labyrinthine


¹Physics Department and Institute for Nanoscience and Engineering, University of Arkansas, Fayetteville, AR, USA. ²Unité Mixte de Physique, CNRS, Thales, Univ. Paris Sud, Université Paris-Saclay, Palaiseau, France. ³School of Physical Science and Technology, Soochow University, Suzhou, China. ⁴Institute of Physics and Physics Department, Southern Federal University, Rostov-na-Donu, Russia. ⁵Université d’Evry, Université Paris-Saclay, Evry, France. ⁶Laboratoire Structures, Propriétés et Modélisation des Solides, CentraleSupélec, UMR CNRS 8580, Université Paris-Saclay, Gif-sur-Yvette, France. *e-mail: yousra.nahas@gmail.com


Nature | Vol577 | 2 January 2020 | 47 


Article 




Fig. 1 | Stripes versus maze at low temperature. a, Ground-state dipolar
configuration (parallel stripes) in the middle layer of an 80 × 80 × 5 unit-cell film
of Pb(Zr0.4Ti0.6)O3, as obtained upon slowly decreasing the temperature from
650 K to 10 K. b, Dipolar configuration of the maze or labyrinthine pattern as
obtained upon abruptly cooling the system from 650 K to 10 K. Grey (red)
dipoles are oriented along the [001] ([00-1]) pseudo-cubic direction.


state is weakly unstable”. Furthermore, we find that at 10 K, the labyrin- 
thine state has a quasi-vanishing relaxation rate without evidence of a 
growing static cooperative length, similarly to a glass-like kinetically 
arrested state”*. The slightly off-equilibrium labyrinthine structure 
only asymptotically departs from the state in which it initially vitrified, 
hence being effectively stationary at T > 0 K in the thermodynamic
limit. The frozen labyrinthine state retains some of the properties of 
the high-temperature paraelectric state (similarly to the common local 
structure exhibited by glasses and their liquid phases), such as the over- 
all absence of long-range orientational order at the mesoscale mirrored 
by its structure factor, which has a ring-shaped spectral weight (inset of
Fig. 2b). However, the spectral weight distribution is deformed by the 
underlying four-fold square lattice anisotropy (four-peaked crown), 
signalling that the labyrinthine domain pattern is only weakly disjoint 
from the square symmetry of the lattice geometry. In fact, upon sub- 
critical quenching of the system, we discern a local tendency of adjacent 
domains to order by adopting one of the two lower equilibrium states 
of the Hamiltonian (either horizontal [100] or vertical [010] periodicity 
of parallel stripes is associated with the two-fold-degenerate ground 
state). This local ordering process ensures a local minimization of the 
energy and can extend only up to a certain finite length scale, beyond 
which collective mismatch and surface tension effects hinder further 
ordering’””’. In this regard, the low-temperature labyrinthine state can 
be apprehended as a mosaic pattern consisting of a spatial mixture of 
tiles with different realization of local order. This labyrinthine state 
inherently features frustration due to the unresolvable competition 
between local interactions and the long-ranged dipolar interaction’. 

Upon heating the labyrinthine state, thermal activation effects come
into play, and the resulting kinetic unfreezing elicits the phenomenon
of inverse transition, whereby a state with higher symmetry transforms
into a lower-symmetry one. This is shown in Fig. 2b-g, where upon
increasing the temperature, the more symmetric labyrinthine phase
obtained by quenching the system from 650 K to 10 K experiences a
lessening of its junctions, resulting in a transient reordering and the
occurrence of the less-symmetric parallel-stripe state at higher
temperatures. The inverse transition sets in at a temperature of T_inv ≈ 200 K,
and the system then becomes paraelectric at a transition temperature
of T_c = 380 K (Extended Data Figs. 1, 2). As the temperature increases,
the distribution of the spectral weight of the structure factor
gradually yields two primary spots along the direction of the Brillouin zone,
perpendicular to the direction of the stripes in real space, mirroring
the acquired long-range orientational order. This inverse symmetry
breaking can be quantified using the order parameter O_hv = (n_h − n_v)/n,
where n is the total number of first-nearest-neighbour pairs of dipoles
whose z components have opposite signs, and n_h (n_v) is the number
of horizontal (vertical) bonds among such dipoles'”°. The average
of this quantity over 100 labyrinthine realizations is shown in Fig. 2a
and its evolution with temperature captures the sequential onset of
three distinct phases: a low-temperature labyrinthine phase with no
net orientation, which bears the symmetry of the underlying lattice; a
mid-temperature broken-symmetry phase with distinguishable
orientation of domains that are all oriented as stripes along a common axis;
and a high-temperature disordered paraelectric phase characterized
by the dissolution of domains and domain walls.
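
The orientational order parameter described above can be sketched for a single layer of dipoles as follows. This is an illustration under stated assumptions, not the authors' code: the layer is treated as a two-dimensional array of z components with periodic boundaries, a "horizontal" bond is taken to join neighbours along the second array axis, and n is taken to be the sum of the horizontal and vertical counts. The function name is hypothetical.

```python
import numpy as np


def orientational_order(pz):
    """(n_h - n_v) / n for a 2D array of dipole z components.

    ASSUMPTIONS: periodic boundaries; horizontal bonds join neighbours along
    axis 1 and vertical bonds along axis 0; n = n_h + n_v counts every
    nearest-neighbour pair whose z components have opposite signs.
    """
    sign = np.sign(pz)
    n_h = np.count_nonzero(sign * np.roll(sign, -1, axis=1) < 0)
    n_v = np.count_nonzero(sign * np.roll(sign, -1, axis=0) < 0)
    n = n_h + n_v
    return 0.0 if n == 0 else (n_h - n_v) / n


# Stripes four cells wide alternating along the horizontal direction of an 80 x 80 layer.
columns = (np.arange(80) // 4) % 2 * 2 - 1
stripes = np.tile(columns, (80, 1))
print(orientational_order(stripes), orientational_order(stripes.T))
```

With this sign convention, perfectly parallel stripes give +1 or −1 depending on their orientation, while a pattern with as many horizontal as vertical opposite-sign bonds, as expected on average for a labyrinth, gives a value near zero.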

As a general energetic feature of domain walls within modulated
phases’, we find that the gain realized by short-range interactions is
counterbalanced by the cost of the dipolar interaction, which plays
an important, if not dominant, role (Extended Data Figs. 3, 4). The
excess length of domain walls within the labyrinthine state therefore
yields an excess in the dipolar cost when compared with the
parallel-stripe domain structure. We find that this excess gradually reduces with
increasing temperature as a result of the straightening of meandering
stripes, and vanishes at T_inv (Extended Data Fig. 5a).

We experimentally observed such an inverse-transition phenomenon
in BiFeO3 thin films (Fig. 3a), in agreement with our first-principles
computations (Fig. 3c). The 95-nm-thick BiFeO3 layer was grown by
pulsed-laser deposition on a (110)-oriented orthorhombic DyScO3
substrate at 933 K (Extended Data Figs. 9, 10) and, after having been
cooled to room temperature, exhibited a labyrinthine domain structure
(Fig. 3a; as-grown). We then performed a series of experiments in which
the as-grown sample was first annealed for 1 h at an elevated target
temperature and then cooled to room temperature with an effective cooling
rate of 2 K min⁻¹. The ferroelectric domain landscape observed at room
temperature after annealing at 773 K, 1,023 K and 1,073 K is shown in
Fig. 3a. As can be readily seen, for target temperatures up to 1,023 K
the labyrinthine morphology is retained, whereas following the 1,073 K
annealing, a profound modification to a perfect stripe domain pattern
sets in. The increased ordering of the ferroelectric array was confirmed
macroscopically by X-ray diffraction (XRD) measurements (Extended
Data Fig. 12) following the pioneering work of Streiffer et al.”. These
experiments indicate an inverse transition (T_inv) between 1,023 K and
1,073 K, while no transition to the paraelectric state could be detected
by XRD up to 1,160 K (Extended Data Fig. 11). Using conducting atomic
force microscopy measurements, we also found that elementary point
defects (Fig. 3b) are characterized by enhanced conduction that can
be up to 50 times larger than the conduction at straight segments of
domain walls. Indeed, we found that the typical current level is 0.2 pA
in domains, 0.5-1.0 pA at domain walls, 15 pA at end-point defects and
50 pA at three-fold junctions. The inverse transition in BiFeO3 is also
seen in our first-principles effective Hamiltonian simulations” ”, which
yield T_inv = 1,100 K and T_c ≈ 1,300 K for BiFeO3 films (Fig. 3c).
Interestingly, the antiferrodistortive (AFD) degrees of freedom in BiFeO3, albeit
coupled to the ferroelectric order parameter”, do not hamper the
onset of the inverse transition (Extended Data Figs. 6, 7). These
first-principles-obtained numerical results, along with our experimental
Fig. 2 | Inverse transition simulations. a, Temperature dependence of the
orientational order parameter O_hv upon slowly heating the labyrinthine state of
an 80 × 80 × 5 film of Pb(Zr0.4Ti0.6)O3. b-g, The evolution of the labyrinthine
domain pattern in the middle layer of the film with increasing temperature:
10 K (b), 110 K (c), 185 K (d), 260 K (e), 335 K (f) and 410 K (g). Grey (red) dipoles
are oriented along the [001] ([00-1]) pseudo-cubic direction. The structure
Fig. 3 | Experimental observation and simulations of the inverse transition
in BiFeO3 films. a, In-plane piezoresponse force microscopy phase images of a
95-nm-thick BiFeO3 film grown on a (110)-oriented orthorhombic DyScO3
substrate, for the as-grown sample, and the same sample after annealing at
773 K, 1,023 K and 1,073 K. The images are 5 × 5 μm². b, In-plane piezoresponse
force microscopy image of a 30-nm-thick BiFeO3 film grown on SrRuO3 (10 nm)/
DyScO3(110) (top left). Scale bar, 2 μm. Conducting atomic force microscopy
(current mapping) images acquired with V,,. = 1.7 V applied on a SrRuO3 bottom
electrode in periodic stripy areas (bottom left) and defected areas (red dashed
lines) with high-conduction spots at three-fold junctions (top right) and end-


findings and other computations given in Methods, demonstrate that
the inverse-transition phenomenon is robust against intrinsic and
extrinsic parameters such as boundary conditions, film thickness,
screening conditions and misfit strain. Naturally, varying the
conditions in the studied systems (Pb(Zr0.4Ti0.6)O3 and BiFeO3) yields
different transition temperatures, as well as different types of domain
walls (for example, 180° in Pb(Zr0.4Ti0.6)O3 versus 109° and 71° domain
walls in BiFeO3) and labyrinthine morphologies, with no effect on
the occurrence of the inverse transition.


factor plots obtained by Fourier transformation of the z component of the
corresponding dipolar field are also provided, where aq_x and aq_y are the x and y
components of the dimensionless wave vector, which take values within the
interval from −π to +π (a is the in-plane lattice constant). The colouring
corresponds to the value of the structure factor, with white (pink) indicating
the lowest (highest) value. The colour scale is the same for all plots.
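
A structure-factor map of the kind described in this caption can be approximated from a single layer of dipole z components with a discrete Fourier transform. The sketch below is an assumption-laden illustration rather than the authors' implementation: the normalization by the number of sites, the subtraction of the mean and the choice of the interval [−π, π) for the dimensionless wave vector are all choices made here, and the function name is hypothetical.

```python
import numpy as np


def structure_factor(pz):
    """Return (aq_x, aq_y, S) for a 2D array of dipole z components.

    S is |FFT|^2 divided by the number of sites; the mean is removed so that
    a uniform offset does not dominate at zero wave vector. Axes are shifted
    so that the dimensionless wave vector runs over [-pi, pi).
    """
    n_y, n_x = pz.shape
    amplitude = np.fft.fftshift(np.fft.fft2(pz - pz.mean()))
    s_q = np.abs(amplitude) ** 2 / pz.size
    aq_x = 2 * np.pi * np.fft.fftshift(np.fft.fftfreq(n_x))
    aq_y = 2 * np.pi * np.fft.fftshift(np.fft.fftfreq(n_y))
    return aq_x, aq_y, s_q


# Parallel stripes four cells wide (period of eight cells) alternating along x.
columns = (np.arange(80) // 4) % 2 * 2 - 1
stripes = np.tile(columns, (80, 1)).astype(float)
aq_x, aq_y, s_q = structure_factor(stripes)
iy, ix = np.unravel_index(np.argmax(s_q), s_q.shape)
print(abs(aq_x[ix]), aq_y[iy])
```

For such stripes the weight concentrates in two spots at aq_x = ±2π/8 and aq_y = 0, that is, along the direction perpendicular to the stripes, whereas an orientationally disordered pattern would spread the weight over a ring.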
point (bottom right) topological defects. Scale bars, 500 nm. c, Distribution of
the z component of polarization (red to green indicate negative to positive
values) in a middle layer of a BiFeO3 film at different temperatures, as obtained
from 36 × 36 × 10 supercell effective-Hamiltonian-based Monte Carlo
simulations under periodic boundary conditions with ideal short-circuit
screening and isotropic misfit strain of −0.16%. The system was abruptly
quenched from 2,000 K to 10 K and subsequently progressively heated up with
40,000 relaxation sweeps at each temperature. In the simulations, we find that
below T_inv, the system exhibits mixed 109° and 71° domain walls, while above
T_inv only 109° domain walls are observed.


Figure 4a provides the evolution with temperature of the free-energy- 
like potential associated with the transverse component of dipoles at 
the domain walls. Each curve is obtained by averaging over the distri- 
butions of 100 labyrinthine realizations. Results are gathered at 10 K, 
110 K and 210 K upon heating the labyrinthine states and the transverse 
component is taken to be the projection of dipoles onto the domain-wall
normal at each point. At 10 K, the potential features three minima,
the leftmost and rightmost ones being associated with the Néel nature 
of the domain walls. The minimum at zero is associated with dipoles 
Fig. 4 | Energy, topological defects and memory effects. a, Free-energy-like
potentials of the labyrinthine domain pattern at 10 K, 110 K and 210 K. Curves
are obtained by calculating the logarithm of the probability distribution
functions p (averaged over the distributions of 100 labyrinthine realizations) of
the component of the local modes transverse to the domain wall, u,,, within an
80 × 80 × 5 Pb(Zr0.4Ti0.6)O3 supercell. Note that the local mode u is a vector
proportional to the electric dipole moment. b, c, Stripe end-points (b) and
three-fold junctions (c). d, Evolution with temperature of the densities d (per
square nanometre) of stripe end-points d, and three-fold junctions d,;. The
insets show the evolution with increasing temperature (from top to bottom) of
the labyrinthine stripe morphology within a portion of the middle layer of the


residing at the boundaries of the domain walls, the orientation of which
is along the [001] and [00-1] directions. Increasing the temperature
leads to a gradual flattening of the minima, ultimately resulting in a
single-minimum potential at u,, = 0 for T = T_inv. The gradual lessening
of barrier heights is associated with increased thermal fluctuations of 
dipoles, which not only lead to more corrugated walls but also enable 
the reorientation of the meandering stripes. In this regard, the barrier 
softening of the transverse components of the domain-wall dipoles 
undermines surface tension effects and enhances wall fluidity. The loss 
of configurational entropy subtending the parallel reorientation of 
the labyrinthine stripes (greater mesoscopic order) is offset by the 
increase of the vibrational entropy of dipoles (greater microscopic 
disorder). Frozen in the ‘disordered’ high-symmetry labyrinthine phase, 
the transverse components of the dipoles melt in the ‘ordered’ low- 
symmetry parallel-stripe state. In this sense, the inverse-transition 
phenomenon, although seemingly counterintuitive, is only inverse 
from the mesoscopic symmetry standpoint, as it can be fathomed 
without violating the laws of thermodynamics or relying on a para- 
doxical inverse entropic scenario*””*"’. 
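
A free-energy-like potential of the kind discussed above can be sketched from sampled transverse dipole components by histogramming them and taking the logarithm of the resulting probability density. The example is a minimal illustration under assumptions made here: the potential is taken as minus the logarithm of the density (so that the most probable values sit at minima), it is shifted so that its lowest point is zero, and the Gaussian samples are synthetic placeholders, not simulation data.

```python
import numpy as np


def free_energy_like_potential(samples, bins=60, value_range=None):
    """Return bin centres and -ln(p), shifted so that the minimum is zero.

    ASSUMPTION: the sign convention -ln(p) and the zero-minimum shift are
    illustrative choices. Empty bins are skipped because ln(0) is undefined.
    """
    density, edges = np.histogram(samples, bins=bins, range=value_range, density=True)
    centres = 0.5 * (edges[:-1] + edges[1:])
    populated = density > 0
    potential = -np.log(density[populated])
    return centres[populated], potential - potential.min()


# Synthetic single-well example: Gaussian fluctuations around zero.
rng = np.random.default_rng(0)
samples = rng.normal(loc=0.0, scale=1.0, size=200_000)
x, f = free_energy_like_potential(samples)
print(x[np.argmin(f)])
```

Applied to a distribution with several preferred values, the same procedure would return one minimum per preferred value, and thermal broadening of the distribution would appear as a lowering of the barriers between them.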

This configurational entropy reduction can be rationalized by
regarding the labyrinthine domain pattern as a fragmented, mosaic state
composed of tiles with a ground-state morphology of a local
parallel-stripe arrangement of domains. Such local realization of mesoscopic
order within each tile is favoured by the dipolar interaction that
stabilizes parallel adjacent stripes. Within the mosaic ansatz, an estimate
of the degeneracy of the labyrinthine phase yields 2^(L²/ξ²) (where ξ is the
typical lateral length of a tile and L is the lateral size of the supercell),
because each of the L²/ξ² tiles can locally harbour a parallel-stripe
alignment along either the [100] or the [010] direction. These exponentially
many labyrinthine states are statistically equivalent while being
morphologically incongruent’®. As expected, ξ is a temperature-dependent
simulated 80 × 80 × 5 Pb(Zr0.4Ti0.6)O3 film. The dark area highlights a local
straightening process of neighbouring stripes upon increasing the
temperature. e, Temporal evolution at 10 K of the internal energy per unit cell
of a 56 × 56 × 5 supercell of Pb(Zr0.4Ti0.6)O3 during the relaxation of the two
bubble states upon removal of the external electric field. The dark (bright)
curve corresponds to the relaxation of the bubble state obtained from electric
field treatment of the labyrinthine (parallel-stripe) pattern. f, g, Consecutive
snapshots of such temporal evolutions (from top to bottom) of the two bubble
states just after removal of the field for the parallel-stripe (f) and labyrinthine
(g) initial patterns. Snapshots correspond to the middle layer of the supercell,
where dark and white regions represent the [001] and [00-1] dipole orientations.


quantity, as can be seen in Extended Data Fig. 5b. Approaching 7, from 
low temperatures, ξ becomes comparable to the lateral size of the supercell
L, indicating the onset of a global symmetry-breaking and long-range 
parallel arrangement of stripes. We find that the coarsening of 
structures is conveyed by the diffusion and relaxation of topological 
defects localized at the junction of different tiles and reconciling 
discrepancies in their prevailing local orientations and/or wave- 
lengths®. The examination of elementary point topological defects 
indeed shows that the densities d, and d,, of stripe end-points (or con- 
vex disclinations of +1/2 Pontryagin charge”*””) and three-fold junctions 
(or concave disclinations of -1/2 Pontryagin charge”®), respectively 
(Fig. 4b, c), feature a gradual lessening upon approaching T,,,, from low 
temperatures (Fig. 4d). We find that domain coalescence is driven by 
the recombination/annihilation of defects**”’, whereby, for instance, 
a pair of concave-convex disclinations rebinds into a diffusing
dislocation (inset of Fig. 4d) yielding a straightening of the labyrinthine
pattern*”°. 
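
The mosaic estimate of the labyrinthine degeneracy, one binary orientation choice for each of the L²/ξ² tiles, can be turned into numbers with a few lines. The sketch below is illustrative only: the tile lengths are hypothetical values, not taken from Extended Data Fig. 5b, and the entropy is expressed in units of the Boltzmann constant.

```python
import math


def tile_count(lateral_size, tile_length):
    """Number of mosaic tiles, L^2 / xi^2."""
    return (lateral_size / tile_length) ** 2


def log2_degeneracy(lateral_size, tile_length):
    """Base-2 logarithm of the degeneracy estimate: one binary choice per tile."""
    return tile_count(lateral_size, tile_length)


def configurational_entropy_in_kb(lateral_size, tile_length):
    """Configurational entropy in units of k_B: ln(2) * L^2 / xi^2."""
    return math.log(2) * tile_count(lateral_size, tile_length)


L = 80  # lateral supercell size in unit cells, as in the 80 x 80 x 5 simulations
for xi in (10, 20, 40, 80):  # hypothetical tile lengths in unit cells
    print(xi, log2_degeneracy(L, xi), configurational_entropy_in_kb(L, xi))
```

When the tile length reaches the supercell size the estimate reduces to two states, matching the two-fold-degenerate parallel-stripe ground state, so the growth of ξ on heating corresponds to a loss of configurational entropy.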

Rather unexpectedly, we find that these modulated phases (stripe 
and labyrinthine domain arrangements) are endowed with memory. 
Upon applying an electric field perpendicular to the film plane, the 
ground state of the stripe domains transforms into a nanobubble phase
before yielding a monodomain state at high enough electric field
values”. We find that the labyrinthine state exhibits an equivalent sequence
of electric-field-induced morphological transitions, that is, from laby- 
rinthine to bubble to monodomain states, with increasing magnitude 
of the external field. The two bubble states obtained from either the 
stripe domains or the labyrinthine ones are energetically equivalent. 
Notably, upon releasing the stabilizing external field, each of the two 
bubble states relaxes back to its parent state morphology, obtained 
before any electric field treatment. This can be seen in Fig. 4e-g, which 
provides the temporal relaxation as obtained from molecular dynamics 


simulations of each of the bubble states upon removal of the field. This 
history-dependent behaviour is rooted in a complex energy landscape
and attests to an original intrinsic memory effect, the seed of which lies
in the arrangement of the bubble array (Extended Data Fig. 8). 

In summary, we report an inverse phase sequence in ferroelectric 
films, whereby a high-symmetry kinetically arrested labyrinthine phase 
transforms into a less-symmetric parallel-stripe domain structure 
upon increasing the temperature. Such an inverse transition involves 
pattern straightening and coarsening and is predominantly driven 
by the relaxation and diffusion of point topological defects. We also 
experimentally show that these nanometric defects exhibit a conductivity
up to 50 times larger than that of straight domain-wall
segments, and numerically demonstrate that the self-assembled
dipolar configurations are endowed with an original memory effect.
These findings will allow the development of novel applications of
ferroelectric films in logic and storage devices, as well as in memristors® **
for neuromorphic computing.
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Methods 


Computational details 

We mimic Pb(Zr0.4Ti0.6)O3 ferroelectric ultrathin films that are grown
along the (001) direction (which is chosen to be the z axis) and are Pb-O
terminated at all surfaces/interfaces. The studied films typically have
a thickness of 2.0 nm (that is, five unit cells) and are subjected to
a compressive strain of −2.65% to ensure that the dipoles preferentially
point along the out-of-plane direction. Such a value approximately
accounts for the mismatch between the lattice constants of the
cubic phases of strontium titanate and Pb(Zr0.4Ti0.6)O3. The films are
interposed between (realistic) electrodes that can screen only 80% of the
polarization-induced surface charges, and are modelled by various L × L × 5
supercells that are periodic along the [100] and [010] directions and
finite along the z axis. Technically, a first-principles-based effective
Hamiltonian is used within Monte Carlo simulations to determine the
energetics and local electric dipoles in each perovskite five-atom cell
of these supercells. The validity of this approach was demonstrated by
previous theoretical studies of ultrathin Pb(Zr,Ti)O3 films under
compressive strain that (1) yield 180° up and down stripe domains
that periodically alternate along [100] (or along [010]) for their ground
state, in agreement with experimental observation (note that ‘up’ and
‘down’ domains refer to domains in which the z component of the dipole
is parallel and antiparallel to the z axis, respectively); (2) predict a
linear dependence of the width of these periodic stripes on the square
root of the film’s thickness, consistent with recent measurements; and
(3) have also led to the prediction of various topological defects such
as vortices, dipolar waves, bubbles and merons (or convex disclinations)
in ferroelectrics, which have been experimentally confirmed. Note that
the predicted temperature has to be rescaled by a factor of approximately
1.6 with respect to measurements. It is also worthwhile clarifying the role
of thickness in the observed inverse transition. The phenomenon
is expected to survive as long as the thickness of the film allows the
stripe domain arrangement; increasing the thickness of the film should
mainly affect the width of the domains.


Additional insights from the computations 

Extended Data Fig. 1a–f shows the evolution of the parallel-stripe
ground state upon slowly increasing temperature. Previous studies
examined the morphology of equilibrium domain patterns as a function
of the magnitude of the gradient terms within the classical
Landau-Ginzburg-Devonshire theory, and found that a labyrinthine-like
ground state can be stabilized if the gradient energy is sufficiently
reduced. We can therefore conclude that, in our case, the effective
gradient energy is above the critical value, which grants
the parallel-stripe ground state upon slowly annealing the system. In
Extended Data Fig. 1g, we provide the temperature variation of the
scaled structure factor, taken as the ratio of $S(aq_s, T)$ to
$S(aq_s, 10\,\mathrm{K})$, where $a$ is the lattice parameter and $q_s$ is
the q point corresponding to the wavelength of the striped phase modulation.
$S(aq_s, T)$ is calculated as the thermodynamic average of the squared
norm of the three-dimensional discrete Fourier transform of the z
component of the local dipoles, $u_z$. From the behaviour of the scaled
structure factor, it can be readily seen that paraelectricity sets in at
$T_C \approx 380$ K. In Extended Data Fig. 2, we show the evolution with
temperature of the specific heat $C$ upon (1) heating the parallel-stripe
ground state and (2) heating the low-temperature, kinetically arrested
labyrinthine state. While the first curve exhibits only one peak, around
$T_C$, the second curve features two peaks, one at the inverse transition
temperature ($T_{\mathrm{inv}}$) and one at $T_C$. $C$ is extracted from
the supercell energy fluctuations,
$C = (\langle E^2 \rangle - \langle E \rangle^2)/(k_B T^2)$, where
$\langle E \rangle$ corresponds to the average over Monte Carlo sweeps of
the internal energy $E$ and $\langle E^2 \rangle$ to that of its square,
and where $k_B$ is the Boltzmann constant.
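The two estimators described above can be sketched as follows. This is a minimal illustration rather than the authors' code: it assumes that the z components of the local dipoles are stored as Monte Carlo snapshots in a NumPy array and the supercell energies as a list of samples. The function names, the array layout and the choice of eV/K for the Boltzmann constant are assumptions made for the example.

```python
import numpy as np

K_B = 8.617333262e-5  # Boltzmann constant in eV/K (unit choice is an assumption)

def structure_factor(u_z_samples):
    """Average over snapshots of |3D DFT of u_z|^2.

    u_z_samples: array of shape (n_snapshots, Lx, Ly, Lz), hypothetical layout.
    """
    ft = np.fft.fftn(u_z_samples, axes=(1, 2, 3))
    return np.mean(np.abs(ft) ** 2, axis=0)

def specific_heat(energies, temperature):
    """C = (<E^2> - <E>^2) / (k_B T^2) from sampled supercell energies."""
    e = np.asarray(energies, dtype=float)
    return (np.mean(e ** 2) - np.mean(e) ** 2) / (K_B * temperature ** 2)

# Toy check: ideal up/down stripes with a period of 8 cells along x
# in a 64 x 64 x 5 supercell (a single snapshot).
L, Lz = 64, 5
x = np.arange(L)
stripes = np.sign(np.sin(2 * np.pi * (x + 0.5) / 8))
u_z = np.broadcast_to(stripes[None, :, None, None], (1, L, L, Lz))
S = structure_factor(u_z)
peak = np.unravel_index(np.argmax(S), S.shape)  # index of the strongest q point
```

For the toy stripes, the strongest peak sits at the q point set by the stripe period (index 64/8 = 8 along x, or its mirror image at 56). Dividing the value of `S` at that q point at each temperature by its value at 10 K gives the scaled quantity discussed in the text.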


Extended Data Figs. 3, 4 provide the probability density functions of 
the cell-by-cell energies calculated for the labyrinthine domain
structure at 10 K for Pb(Zr0.4Ti0.6)O3 within a 64 × 64 × 5 supercell. Extended
Data Fig. 3a–d refers to the on-site energy, first nearest neighbours
(1NN) interaction energy, second nearest neighbours (2NN) interaction
energy and dipole-dipole interaction energy, respectively. Extended
Data Fig. 4a–c pertains to the third nearest neighbours (3NN)
interaction energy, elastic energy and electrostrictive energy, respectively.
The mappings of each contribution to the energy onto the middle 
layer of the film are also provided in these figures. It is therein seen 
that while the on-site energy, the second and third nearest neighbours 
interaction energy, as well as the electrostrictive energy feature energy 
gain at the domain walls, the dipole-dipole interaction is the main 
counterbalancing cost. 

In Extended Data Fig. 5a, we show the temperature dependence
of the dipole-dipole energy density upon heating both the
ground-state parallel-stripe domain pattern and the kinetically arrested
labyrinthine state. It can be seen that before the onset of the inverse
transition (around 200 K), the excess dipolar energy in the labyrinthine
state gradually reduces with increasing temperature as a result of the 
straightening of meandering stripes. Also provided in this figure is the 
estimate of the evolution with temperature of the dipolar energy den- 
sity of a fictive labyrinthine state whose serpentine stripe domains are 
artificially precluded from straightening. The mismatch between the
curves associated with the fictive and the real evolution of the
labyrinthine state establishes that the labyrinthine domain pattern effectively
reduces its energy upon increasing temperature by adopting a parallel
reordering of its stripes. In Extended Data Fig. 5b, we show the growth
with temperature of the typical lateral length ξ of the tiles. Upon approaching
$T_{\mathrm{inv}}$ from low temperatures, ξ becomes comparable to the lateral size
of the supercell L, indicating the onset of a global symmetry-breaking
and long-range parallel arrangement of stripes. 

We performed additional first-principles-based effective Hamiltonian
simulations for BiFeO3 films of different geometries. Specifically,
we simulated thick (with respect to the lattice constant) BiFeO3
films where local modes (proportional to dipoles) are centred on the
A-sites of the perovskite structure. The supercell size was 36 × 36 × 10
and the film was subjected to a compressive strain of −0.16% (Extended Data Fig. 6).
We have also examined ultrathin BiFeO3 films using the film effective
Hamiltonian model where local modes (proportional to dipoles) are
centred on the B-sites of the perovskite structure. In this second case,
the film thickness was taken to be five unit cells (as in the simulations
on the Pb(Zr0.4Ti0.6)O3 ultrathin film), and the misfit strain was set to −0.5%.
Partial screening electric boundary conditions at the film interfaces were
used (Extended Data Fig. 7). Both numerical approaches listed above
include AFD degrees of freedom in addition to variables describing
inhomogeneous and homogeneous strain as well as local mode vectors.
The employed Hamiltonians incorporate, among other terms, the coupling
of AFD and ferroelectric degrees of freedom, as well as short-range
interactions of each of the two order parameters. Both Extended Data
Figs. 6 and 7 show that upon heating the deep-quench-obtained
low-temperature configurations, the domain pattern gradually transforms
into parallel-stripe domains. Interestingly, the AFD vectors feature
similar behaviour with increasing temperature for both investigated
BiFeO3 film geometries. These first-principles-obtained numerical
results, along with their experimental realizations, demonstrate that
the inverse-transition phenomenon is robust against boundary conditions,
film thickness, screening conditions and misfit strain.

We also provide additional details regarding the discovered memory 
effect. We found that bubbles emerge from either the labyrinthine or 
the parallel-stripe states starting from an applied external field value
of 32 × 10⁷ V m⁻¹. Beyond the threshold field of 42 × 10⁷ V m⁻¹, the system
forgets its history and does not relax back to the original state. This
value is below the field value of 52 × 10⁷ V m⁻¹ that induces the transition
to the monodomain state. Note that typically, theoretical electric fields
are about 20 times larger than the experimental ones. We found that
the seed underlying this memory effect is rooted in the arrangement of 
bubbles. The array of bubbles obtained from the parallel-stripe phase 
shows two additional peaks in its structure factor plot, at the position 
of the wave vectors that define the periodicity of the parent stripe 
state. Such peaks are absent in the structure factor characterizing the 
array of bubbles obtained from the labyrinthine state (Extended Data 
Fig. 8). 


Experimental details 

The BiFeO3 thin film was grown by pulsed laser deposition on a
(110)-oriented DyScO3 substrate using an excimer laser. First, a 5-nm-thick
electrode of SrRuO3 was deposited at 933 K under 0.2 mbar of oxygen with a
laser frequency of 5 Hz. The 95-nm-thick BiFeO3 film was grown at 933 K
under 0.36 mbar of oxygen with a laser frequency of 1 Hz. The bilayer
was then cooled down to room temperature under 300 mbar of oxygen.
The XRD pattern shows the monoclinic (001) orientation of BiFeO3
with Laue fringes attesting to the high quality of the epilayer.
Piezoresponse force microscopy (PFM) indicates a homogeneous out-of-plane
polarization direction towards the SrRuO3 electrode. The in-plane PFM
contrast shows two alternating variants with 71° domain walls (Fig. 3a).
We conducted successive ex situ annealing experiments under oxygen
flow on this sample, increasing the maximum temperature from 773 K
to 1,073 K, ramping at 20 K min⁻¹ from room temperature and keeping
the maximum temperature constant for 1 h. The cool-down process
was limited by the inertia of the oven and we estimate the cooling rate
to be around 2 K min⁻¹. The resulting PFM domain structure evolution
is shown in Fig. 3a for annealing temperatures of 773 K, 1,023 K
and 1,073 K. No substantial change was observed in the maze-like
pattern up to 1,023 K, while a profound modification to perfectly straight
lines is observed after the 1,073 K annealing. Note that the PFM images
were taken on random zones of the 5 × 5 mm² sample. While the surface
topography shows surface desorption in addition to the preserved step
and terrace structure (Extended Data Fig. 9), XRD does not reveal any
structural changes induced by the successive annealing (Extended
Data Fig. 10).
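To give a feel for the time scale of one annealing cycle, the ramp, hold and cooling figures quoted above can be combined in a short calculation. This is only a rough sketch: the room temperature of 293 K and the function name are assumptions, and the cooling rate is the estimate given in the text rather than a controlled value.

```python
ROOM_T_K = 293.0  # assumed room temperature; not stated in the text

def anneal_minutes(t_max_k, ramp_k_per_min=20.0, hold_min=60.0, cool_k_per_min=2.0):
    """Return (heating, hold, cooling) durations in minutes for one cycle."""
    heating = (t_max_k - ROOM_T_K) / ramp_k_per_min
    cooling = (t_max_k - ROOM_T_K) / cool_k_per_min
    return heating, hold_min, cooling

heating, hold, cooling = anneal_minutes(1073.0)
total_hours = (heating + hold + cooling) / 60.0
```

Under these assumptions, the slow oven-limited cooling dominates the duration of the 1,073 K cycle.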

We additionally conducted PFM experiments with an atomic force 
microscope (Nanoscope V multimode, Bruker) and external SR830 
lock-in detectors (Stanford Research) for simultaneous acquisition of 
in-plane and out-of-plane responses. A DS360 external source (Stanford 
Research) was used to apply the a.c. excitation to the SrRuO3 bottom
electrode at a frequency of 35 kHz while the conducting platinum-
coated tip was grounded. The out-of-plane response is homogeneous,
in accordance with the homogeneous pristine downward polarization
all over the BiFeO3 thin film. Current maps were acquired with
the same tip connected to a transimpedance amplifier (TUNA, Bruker)
with $V_{\mathrm{dc}}$ = 1.7 V applied on the SrRuO3 bottom electrode. The data show
enhanced conduction for labyrinthine defects, as reported in Fig. 3b.

XRD measurements as a function of temperature were performed 
using a high-resolution two-axis diffractometer equipped with an 18 kW
rotating anode generator (Rigaku), with a Bragg-Brentano geometry
and a 50-cm-diameter focalization circle allowing an accuracy as
high as 0.0002 Å in 2θ. The (002) out-of-plane pseudo-cubic Bragg
peak of the BiFeO3 thin film grown on SrRuO3/DyScO3 was measured between
300 K and 1,160 K (precision better than 1 K) with a step of 20 K. Above
1,160 K, the film decomposes. From the measured Bragg peak posi- 
tion, the out-of-plane unit cell parameter is extracted and reported 
in Extended Data Fig. 11 and shows a quasi-linear variation of the film 
parameter with temperature, which indicates that there is no phase 
transition up to 1,160 K. 
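The extraction of the out-of-plane parameter from the position of the (002) peak follows from Bragg's law, and can be sketched as below. The X-ray wavelength is not given in the text, so the Cu Kα1 value used here is an assumption, as is the function name.

```python
import math

CU_KALPHA1_ANGSTROM = 1.5406  # assumed wavelength; the source is not stated in the text

def out_of_plane_parameter(two_theta_deg, l_index=2, wavelength=CU_KALPHA1_ANGSTROM):
    """Pseudo-cubic out-of-plane parameter (in angstroms) from a (00l) Bragg peak."""
    theta = math.radians(two_theta_deg / 2.0)
    d_spacing = wavelength / (2.0 * math.sin(theta))  # Bragg's law, first order
    return l_index * d_spacing

# Round trip for a hypothetical parameter of 4.0 angstroms.
two_theta = 2.0 * math.degrees(math.asin(CU_KALPHA1_ANGSTROM / 4.0))
c_param = out_of_plane_parameter(two_theta)
```

Repeating this for the peak position at each temperature step gives the curve whose quasi-linear shape is used to rule out a phase transition.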


As visible from Extended Data Fig. 12, we observe the same features 
as Yang et al. in the reciprocal space mappings measured in our
BiFeO3 thin films, and the relative intensity of the ‘superlattice’ peaks
is increased after annealing. From the in-plane PFM image after annealing
(Extended Data Fig. 12b), we estimate the width of the domains (or
the periodicity of the domain walls) to be 90 ± 5 nm. Consistently, the
satellites around the (002) BiFeO3 film peak (Extended Data Fig. 12d)
correspond to a periodicity of 95 ± 5 nm. We checked that these features
disappear when aligning the X-ray beam parallel to the ferroelectric
stripes (Φ = 0°) and performing the same reciprocal space mappings around
(002).
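The link between satellite spacing and real-space periodicity can be made explicit with a short sketch. It assumes that the reciprocal-space coordinate is expressed in Å⁻¹ with the 2π convention (Q = 2π/d); the function names are hypothetical and the satellite spacing below is chosen for illustration, not read from the data.

```python
import math

def domain_period_nm(delta_q_inv_angstrom):
    """Real-space period (nm) from the satellite spacing, assuming Q = 2*pi/d."""
    return 2.0 * math.pi / delta_q_inv_angstrom / 10.0

def agree_within_errors(a, err_a, b, err_b):
    """True if two measurements overlap within their quoted uncertainties."""
    return abs(a - b) <= err_a + err_b

period = domain_period_nm(2.0 * math.pi / 950.0)  # illustrative spacing, gives 95 nm
pfm_vs_xrd = agree_within_errors(90.0, 5.0, 95.0, 5.0)
```

With the values quoted in the text, the PFM estimate (90 ± 5 nm) and the X-ray estimate (95 ± 5 nm) overlap within their uncertainties.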
Extended Data Fig. 2 | Specific heat of the parallel-stripe and labyrinthine states. Specific heat C as a function of temperature (in arbitrary units). Data were
gathered upon slowly heating the ground-state parallel-stripe domain pattern (1) and the labyrinthine domain pattern (2).
Extended Data Fig. 3 | Spatial distribution of on-site, first and second
nearest neighbours and dipole-dipole interaction energies. a–d, The
probability density functions of the cell-by-cell energies (on-site energy (a),
first nearest neighbours (1NN) interaction energy (b), second nearest
neighbours (2NN) interaction energy (c) and dipole-dipole interaction energy
(d)) calculated for the labyrinthine domain structure at 10 K for Pb(Zr0.4Ti0.6)O3
within a 64 × 64 × 5 supercell. Each panel provides the contributions stemming
from the domains and domain walls, separately. e–h, The corresponding
mappings of energies onto the middle layer of the film. Blue to red colour
gradient shows increasing values of unit-cell energies.
Extended Data Fig. 4 | Spatial distribution of third nearest neighbours,
elastic and electrostrictive energies. a–c, The probability density functions
of the cell-by-cell energies (third nearest neighbours (3NN) interaction energy
(a), elastic energy (b) and electrostrictive energy (c)) calculated for the
labyrinthine domain structure at 10 K for Pb(Zr0.4Ti0.6)O3 within a 64 × 64 × 5
supercell. Each panel provides the contributions stemming from the domains
and domain walls, separately. d–f, The corresponding mappings of energies
onto the middle layer of the film. Blue to red colour gradient shows increasing
values of unit-cell energies.
Extended Data Fig. 5 | Energetics and spatial correlations at play in the
inverse transition. a, Evolution with temperature of the dipole-dipole energy
density upon heating the ground-state parallel-stripe domain pattern (1) and
the labyrinthine domain pattern (2). These two curves meet above 200 K, the
temperature at which the inverse transition occurs. The third curve (3)
corresponds to what the dipole-dipole energy of the domain walls would
have been if the labyrinthine domain walls had gradually wiggled with no
reordering of the stripes (fictive labyrinthine evolution). b, Evolution with
temperature of the typical size of the locally ordered ground-state tiles composing
the labyrinthine domain pattern. Data were obtained via the analysis of
structure factors of square patches of varying size at each temperature.
Specifically, at each temperature, ξ corresponds to the maximal patch size
featuring a two-peaked structure factor. The solid line is a guide for the eye.
Extended Data Fig. 6 | Simulations of the inverse transition in thick BiFeO3
films. a, b, The evolution with temperature of the domain pattern in BiFeO3 in
terms of the distribution of the ferroelectric (a) and AFD (b) order parameters.
Results were obtained through Monte Carlo simulations using the effective
Hamiltonian scheme of a 36 × 36 × 10 film subjected to a −0.16% misfit strain,
with periodic boundary conditions. The system was abruptly quenched from
2,000 K down to 10 K and subsequently progressively heated up with 40,000
relaxation sweeps at each temperature. It can be seen that the distributions of
both ferroelectric and AFD order parameters exhibit the inverse transition with
$T_{\mathrm{inv}}$ ≈ 1,100 K and $T_C$ ≈ 1,300 K (these numerically predicted temperatures are in
good agreement with our experimental findings). We find that below $T_{\mathrm{inv}}$, the
system exhibits mixed 109° and 71° domain walls, while above $T_{\mathrm{inv}}$, only 109°
domain walls are observed. In a, dipoles are coloured according to their z
component. In b, AFD vectors are coloured according to arctan($W_y/W_x$),
where $W_y$ and $W_x$ denote the y and x components of the AFD local vectors.
Extended Data Fig. 7 | Simulations of the inverse transition in thin BiFeO3
films. a, b, The evolution with temperature of the domain pattern in BiFeO3 in
terms of the distribution of the dipolar (a) and AFD (b) order parameters.
Results were obtained through Monte Carlo simulations using the effective
Hamiltonian scheme of a 36 × 36 × 5 film subjected to a −0.5% misfit strain with
open boundary conditions and partial screening at the film interfaces (effective
screening parameter β = 0.5). The system was abruptly quenched from 2,000 K
down to 10 K and subsequently progressively heated up with 40,000 relaxation
sweeps at each temperature. It can be seen that the distributions of both
ferroelectric and AFD order parameters exhibit the inverse transition with
$T_{\mathrm{inv}}$ ≈ 525 K and $T_C$ ≈ 650 K. We find that the system exhibits 71° domain walls. In
a, dipoles are coloured according to their z component. In b, AFD vectors are
coloured according to arctan($W_y/W_x$), where $W_y$ and $W_x$ denote the y and x
components of the AFD local vectors.
temperature in the BiFeO3 sample. a–l, Topography, in-plane PFM phase and ‘z-scale’ corresponds to 4 nm (a, d, g) and 10 nm (j).
Extended Data Fig. 10 | Structural properties of the BiFeO3 sample before and after annealing. 2θ–ω XRD patterns of the as-grown BiFeO3 sample and the same
sample after the successive annealing up to 1,073 K. a, Full scale. b, Zoom around the (001) peak.
Extended Data Fig. 11 | Evolution with temperature of the lattice parameter of the BiFeO3 sample. Evolution of the out-of-plane parameter upon heating the
parallel-stripe phase of the BiFeO3 sample. Values were obtained by fitting the XRD data and do not reveal any phase transition up to 1,160 K.
Extended Data Fig. 12 | Ferroelectric and elastic domains in the BiFeO3
sample. Ferroelectric and elastic domain structures in a BiFeO3 thin film grown
on a (110)-oriented DyScO3 substrate before and after annealing. a, b, In-plane
PFM phase images of a BiFeO3 thin film for an as-grown sample (a) and a sample
after annealing at 1,073 K for 1 h (b). Images are 2 × 2 μm². c, d, Reciprocal space
mappings around (002) reflections for the same BiFeO3 thin film for the
as-grown sample (c) and the sample after annealing at 1,073 K for 1 h (d). The pink
arrows indicate the satellite positions to the left and right of the (002) film
peak. The X-ray beam is aligned at Φ = 90°, that is, perpendicular to the stripes.
The indices of DyScO3 and BiFeO3 are written in the monoclinic cells. $Q_{\parallel}$ and $Q_{\perp}$
indicate the in-plane and out-of-plane reciprocal space units, respectively.
The proper functioning of living systems and physiological phenotypes depends on
molecular composition. Yet simultaneous quantitative detection of a wide variety of
molecules remains a challenge. Here we show how broadband optical coherence
opens up opportunities for fingerprinting complex molecular ensembles in their
natural environment. Vibrationally excited molecules emit a coherent electric field
following few-cycle infrared laser excitation, and this field is specific to the sample’s
molecular composition. Employing electro-optic sampling, we directly measure
this global molecular fingerprint down to field strengths 10⁷ times weaker than that of
the excitation. This enables transillumination of intact living systems with thicknesses
of the order of 0.1 millimetres, permitting broadband infrared spectroscopic probing
of human cells and plant leaves. In a proof-of-concept analysis of human blood serum,
temporal isolation of the infrared electric-field fingerprint from its excitation along
with its sampling with attosecond timing precision results in detection sensitivity of
submicrograms per millilitre of blood serum and a detectable dynamic range of
molecular concentration exceeding 10⁵. This technique promises improved molecular
sensitivity and molecular coverage for probing complex, real-world biological and
medical settings.


The molecular composition of living organisms is a sensitive 
indicator of their physiological states. Even apparently simple physi- 
ological transitions are often connected to highly multivariate concur- 
rent molecular changes. Therefore, the capability to simultaneously 
observe changes in concentrations of a variety of molecules embedded 
in complex organic consortia is likely to be instrumental in advancing 
biology and medical diagnostics systems. 

Many biologically relevant changes occur at concentration levels 
that are often not detectable in system-wide molecular milieus owing 
to the vast dynamic range of molecular concentrations. Simultaneous
quantitative probing of multiple molecules within a complex
consortium relies on either biochemical separation of certain types of
molecules or depletion of highly abundant ones. Such approaches
are time-consuming or expensive or suffer from poor reproducibility, 
impeding robust, high-throughput implementations. Here we harness 
broadband optical coherence to address this challenge directly. 

Optical spectroscopy of biological samples interrogates the chemical
substructures of intact molecules (molecular fragments) rather
than molecules as a whole by detecting their resonant vibrational
response to infrared or Raman excitation. The occurrence of the same
or similar fragments in different biomolecules and rapid dephasing
result in overlapping temporal and spectral responses and hamper
the identification of individual molecules in complex samples. However,
the detected superposition of the responses of all fragments is
characteristic of molecular composition, representing what may be
referred to as the global molecular fingerprint (GMF) of the sample.
Higher excitation power increases the GMF signal, making smaller
changes in the sample’s molecular composition detectable. In
spectroscopies that capture time-integrated fields (that is,
frequency-resolved spectroscopy), the GMF signal hits the detector along with the
(much stronger) excitation transmitted through the sample. This has
far-reaching implications. First, in the limit of strong excitation, the
weakest molecular signal detectable tends to be limited by the technical
noise of the excitation source. Second, and more fundamentally,
even in the absence of technical noise, saturation of the detector
(elements) places a limit on the sensitivity. These limitations are
schematically illustrated in Fig. 1a, see ‘Frequency-resolved spectroscopy’.
In this work, we show how time-resolved sampling of the electric
field emitted by impulsively excited molecular vibrations allows us to
overcome these limitations by isolating the retarded molecular signal
from any excitation background. We term the technique field-resolved
spectroscopy (FRS). Sensitive sampling of the isolated molecular signal
generated by a powerful, ultrashort-pulsed infrared source enables
broadband transmission spectroscopy of biological systems in their
Fig. 1 | Infrared FRS. a, Schematic comparison of spectroscopic techniques.
Infrared light (white bar length indicates source power) with intensity noise
(technical noise, red hatching) is transmitted through a sample, acquiring GMF
information (cyan shading). For frequency-resolved spectroscopy, the GMF
signal is detected ‘on top’ of the excitation signal transmitted through the
sample. As a consequence, (1) the GMF signal needs to surpass the excitation
noise (surviving balanced detection) and (2) enhancing the GMF signal by
increasing the excitation power is limited by the detector’s dynamic range. For
FRS, following a few-cycle excitation, sub-optical-cycle nonlinear gating
isolates ultrabrief fractions of the GMF from any infrared background,
avoiding both requirement (1) and limit (2); see Methods. b, Infrared electric
field as reconstructed from the measured electro-optic sampling (EOS) trace
using an 85-μm-thick GaSe EOS crystal (Supplementary Information section I)
after transmission through a solution of 10 mg ml⁻¹ DMSO2 in water.


natural, aqueous environment (see ‘Field-resolved spectroscopy’
in Fig. 1a).


Field-resolved molecular spectroscopy 

Fourier-transform infrared (FTIR) spectrometers employing thermal
radiation sources are the gold standard for broadband vibrational
spectroscopy. In liquid samples, they have detected


The reconstructed electric field strongly resembles the EOS signal, owing to 
the broadband instrument response function. The resonant sample response 
is temporally well separated from the non-resonant response (incorporating 
the excitation) and exhibits ‘beating’ of several oscillation frequencies. 

c, Fourier transform of the EOS trace shown in b, truncated at 1.5 ps to exclude
spectral modulations caused by the echo in the EOS crystal. The solid red line
shows the spectral intensity, revealing absorption dips associated with
vibrational modes of DMSO2 molecules; the black dashed line shows the
spectral phase; the cyan line shows the spectral intensity of the signal in the
time window 380–1,500 fs, showing time-filtered GMF information. d, Spectral
detection sensitivity above the detection noise floor (3-ps time window, 25-s
measurement time, transmission through cuvette filled with water). The solid
and dashed lines are the bandwidth-optimized and quantum-efficiency-maximized
EOS (Supplementary Information section I), respectively.


concentration levels down to several micrograms per millilitre.
This limitation has so far been overcome only by sample drying or
targeted detection with functionalized optical biosensors.
Recently, tunable quantum cascade lasers and femtosecond
laser sources have dramatically enhanced the excitation brilliance.
For the reasons sketched in Fig. 1a and explained in the Methods,
frequency-resolved spectroscopies have not been able to fully capitalize
on this to achieve improved sensitivity and specificity in molecular
Fig. 2 | Background quantification for detection of resonant molecular
responses. a, The red line is the time-resolved magnitude of the EOS signal
(revealing field oscillations) related to the detection noise floor (signal-to-
noise ratio), for a reference measurement of pure water (quantum-efficiency-
maximized detection setting, 37-s effective measurement time). Following the
excitation, the molecular signal from residual atmospheric background in the
beam path is observed. The cyan line is the numerical difference of two 
independent reference measurements. The recorded traces were frequency- 


detection. Here, we show how FRS of few-cycle infrared-laser-excited
molecular vibrations enables us to take advantage of the temporal 
structure and power of laser-driven few-cycle infrared sources. 
Fig. 3 | Limit of detection of DMSO2 molecules dissolved in water. a, Results
of the concentration retrieval (see Supplementary Information section IV) with 
quantum-efficiency-optimized FRS (red data points) and FTIR (blue data 
points). The dots indicate the mean values obtained from at least five 
measurements per concentration and the error bars show the absolute 
standard deviation. b, Relative standard deviation for the retrieved values. 
LOD, limit of detection. The coloured shading indicates the range of 
concentrations exceeding the LOD of each instrument. 
filtered by a 20th-order super-Gaussian filter suppressing any noise outside the
spectral window 900–1,450 cm⁻¹. The grey dotted line is the 190-fs (full-
intensity-width-at-half-maximum duration) ideal Gaussian pulse, for
comparison. b, Frequency-domain definition of $DR_E$ and $t_B$. The magnitudes of
the Fourier transforms of the traces in a are shown for different numerical high-
pass time filter values. Setting the filter at $t_B$ (the beginning of the background-
free time-domain measurement, rightmost panel) yields an electric-field peak
dynamic range of $DR_E$ = 1.5 × 10° around 1,140 cm⁻¹.


The experimental setup is described in the Methods and in Supple- 
mentary Information section I (see also Extended Data Figs. 3, 4). In 
short, waveform-stable, few-cycle mid-infrared (MIR) pulses abruptly 
excite molecular vibrations by resonant absorption. The sample- 
specific electric field (previously referred to as GMF) emitted in the 
wake of the excitation pulse (Supplementary Video 1 and Methods) is 
detected via EOS (Fig. 1b, c). The thickness of the electro-optic
crystal controls a trade-off between the bandwidth and the sensitivity 
of detection (Fig. 1d). 

The nonlinear frequency conversion underlying EOS sequentially 
isolates ultrabrief fractions of the GMF from any infrared background,
including the excitation pulse transmitted through the sample and the
thermal background (see Fig. 1a and Methods). Drawing on preliminary
experiments, here we report a direct measurement of MIR molecular
electric fields emanating from biological samples. 
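The benefit of temporal isolation can be illustrated with a toy numerical model. The sketch below is not the experimental analysis: it builds a synthetic trace from a short Gaussian excitation and a much weaker, slowly decaying oscillation standing in for the molecular response, then applies a numerical high-pass time filter before the Fourier transform. All parameter values (pulse width, decay time, amplitudes, gate time) and the function name are assumptions chosen for illustration; the response frequency of 0.0342 cycles per femtosecond corresponds to roughly 1,140 cm⁻¹.

```python
import numpy as np

def gated_spectrum(t_fs, field, t_gate_fs):
    """Magnitude spectrum of the field after zeroing everything before t_gate_fs."""
    gated_field = np.where(t_fs >= t_gate_fs, field, 0.0)
    return np.abs(np.fft.rfft(gated_field))

dt = 2.0                               # sampling step in fs (illustrative)
t = np.arange(0.0, 8192.0, dt)         # time axis in fs
f_exc, f_mol = 0.0336, 0.0342          # carrier and resonance, cycles per fs

# Strong, short excitation pulse centred at 200 fs.
excitation = np.exp(-((t - 200.0) / 40.0) ** 2) * np.cos(2 * np.pi * f_exc * t)
# Weak response that starts with the pulse and decays over about 1 ps.
response = (1e-3 * np.exp(-(t - 200.0) / 1000.0)
            * np.cos(2 * np.pi * f_mol * t) * (t >= 200.0))
trace = excitation + response

freqs = np.fft.rfftfreq(t.size, d=dt)
full = gated_spectrum(t, trace, 0.0)    # excitation and response together
late = gated_spectrum(t, trace, 600.0)  # only the wake of the excitation
i_mol = int(np.argmax(late))            # strongest component after gating
```

In the ungated spectrum the resonance sits on top of an excitation background that is far stronger, whereas after gating the strongest spectral component is the resonance itself, which is the situation the time-domain measurement exploits.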


Detection of time-gated molecular signals 


In any scheme measuring time-integrated fields, the minimum detectable
absorbance, $\mathrm{MDA_{FD}}$, defining the minimum detectable depth of
the dips in the red line in Fig. 1c, is given by (Supplementary Information
section II):

$$\mathrm{MDA_{FD}} \approx \sigma \qquad (1)$$

where $\sigma$ represents the relative fluctuations of the measured signal in
the considered spectral element. Here, $\sigma$ incorporates contributions
from excitation and detection noise, as well as from the limited detector
dynamic range.
Fig. 4 | GMFs of human blood serum and their reproducibility. a, Magnitude of the EOS signals, recorded with quantum-efficiency-optimized FRS (see key). The
insets show linear-scale representations of the signals depicted in the main panel in two different time windows. b, c, Relative (b) and absolute (c)
root-mean-square (RMS) of oscillation amplitude and zero crossings of five hundred measurements of the GMF of a serum sample (without sample exchange) (see
Supplementary Information section V).


In FRS, temporal isolation of (wave-cycle-scale) fractions of the GMF
renders the weakest detectable molecular response largely immune
to the noise of excitation intensity, as is apparent from the cyan line
in Fig. 1c. This is indicated by the expression for the MDA obtained by
time-domain modelling of the molecular system with an isolated Lorentzian
oscillator of dephasing time $T_L$ (Supplementary Information section II):


(2) 


Here, the dynamic range $\mathrm{DR}_E$ is defined as the ratio of the spectral
amplitude of the electric field of the overall signal reaching the detector at
the centre frequency of the Lorentzian oscillator to that of the weakest
signal detectable after passage through a temporal filter opening at $t_B$.
The parameter $t_B$ is defined as the instant when the temporal window
for an infrared-background-free measurement begins.

This is the case when the numerical difference between two subsequent
measurements (in this case, of liquid water) reaches the detection
noise floor (Fig. 2a). In our proof-of-principle measurement with
the quantum-efficiency-maximized FRS setting, this occurs at about
$t_B$ = 1,500 fs, yielding a value of $\mathrm{DR}_E$ in excess of 10° for absorptions
with centre frequencies between 1,080 cm⁻¹ and 1,190 cm⁻¹ (for a 7-ps
time window and 37-s effective measurement time, see right panel of
Fig. 2b). For a dephasing time of the order of a picosecond, typical for
an aqueous environment, equation (2) predicts a minimum detectable
absorbance of the order of 10°.
Fig. 5 | Sensitivity and specificity of FRS of complex fluids performed with
bandwidth-optimized sampling. a, Principal component analysis results
(separation along the 1st principal component) for a human blood serum
sample containing an added aqueous solution of decreasing DMSO₂
concentration, and fingerprinted with FRS using quantum-efficiency-optimized
detection (left panel) and with FTIR (right panel). The plots show the
mean and relative standard deviation of the values of the 1st principal
component for data classes obtained by repeated measurements of samples
with nominally identical added DMSO₂ concentration. b, Principal component
analysis results for a mixture of two sugars dissolved in water with constant
total concentration and varying relative concentration (see text), and
fingerprinted with FRS using bandwidth-optimized detection (left panel) and
with FTIR (right panel).


For experimental verification, we investigated methylsulfonylmethane
(DMSO₂) dissolved in deionized water. FRS was benchmarked
against a state-of-the-art FTIR spectrometer equipped with a thermal
infrared source (MIRA Analyzer, Micro Biolytics; see Supplementary
Information section III). With both instruments, at least five aliquots
of concentrations ranging from 1 mg ml⁻¹ to 100 ng ml⁻¹ were measured
over a duration of T = 45 s each, with a spectral resolution of 4 cm⁻¹
(realized in FRS by setting the duration of the temporal window of
measurement equal to 8.3 ps). Reference measurements of solvent only
(deionized water) were performed in alternating order. The concentration
values retrieved from the measured data (see Supplementary
Information section IV) are summarized in Fig. 3. The limit of detection
is defined as the concentration retrieved with a relative standard
deviation of 100%. Our study yields an FRS limit of detection of
200 ng ml⁻¹, a factor of 40 lower than that obtained with the FTIR
spectrometer (8 µg ml⁻¹). This is in agreement with the prediction of
equation (2); see Supplementary Information section IV and Extended
Data Fig. 7. We estimate a limit of detection of approximately 7 µg ml⁻¹
for Fourier-transform spectroscopy (FTS) performed with our coherent
infrared source and state-of-the-art infrared photodetectors
(see Methods).
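
As a quick consistency check on the numbers in this paragraph, the sketch below converts the 8.3-ps measurement window into a spectral resolution and recomputes the quoted detection-limit ratios. It assumes the common Fourier relation, resolution ≈ 1/(c × window length); the text does not state which convention was actually used, and all names are illustrative.

```python
# Consistency check for the quoted resolution and detection-limit ratios.
# Assumption (not stated in the text): resolution ~ 1 / (c * window length).

C_CM_PER_S = 2.99792458e10  # speed of light in cm/s


def resolution_cm1(window_s: float) -> float:
    """Wavenumber resolution (cm^-1) for a time window of the given length."""
    return 1.0 / (C_CM_PER_S * window_s)


def lod_ratio(lod_a_ng_per_ml: float, lod_b_ng_per_ml: float) -> float:
    """How many times larger detection limit a is than detection limit b."""
    return lod_a_ng_per_ml / lod_b_ng_per_ml


FRS_LOD = 200.0     # ng/ml, FRS (measured)
FTIR_LOD = 8_000.0  # ng/ml, FTIR (measured), i.e. 8 ug/ml
FTS_LOD = 7_000.0   # ng/ml, FTS (estimated), i.e. 7 ug/ml

if __name__ == "__main__":
    print(f"resolution for 8.3 ps: {resolution_cm1(8.3e-12):.2f} cm^-1")
    print(f"FTIR / FRS: {lod_ratio(FTIR_LOD, FRS_LOD):.0f}")
    print(f"FTS / FRS:  {lod_ratio(FTS_LOD, FRS_LOD):.0f}")
```

Under this assumption the 8.3-ps window corresponds to roughly 4 cm⁻¹, and the ratios reproduce the factors of 40 and 35 quoted in the text and in the Methods.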

The exponential dependence of the detection limit on $t_B$ in equation (2)
emphasizes how FRS is fundamentally different from any frequency-domain
spectroscopy, where $t_B$ is irrelevant (see also Methods).
To investigate this dependence, and thereby this hitherto unexplored
advantage, we repeated the DMSO₂ dilution series measurement with
shorter, sub-60-fs infrared excitation pulses (Supplementary Information
section I) and the bandwidth-optimized detection setting of the
FRS instrument (Fig. 1d, continuous line). This combination substantially
improved the opening time for background-free detection to
$t_B$ = 450 fs (Supplementary Information section IV). The improvement
came at the expense of a factor-of-ten reduction of $\mathrm{DR}_E$ (Fig. 1d). This
reduction would, in its own right, result in a factor-of-ten increase of
the minimum detectable concentration, according to equation (2).
By contrast, we observe an increase from 200 ng ml⁻¹ to 450 ng ml⁻¹
only, mainly due to shortening $t_B$ from 1.5 ps to 0.45 ps (Supplementary
Information section IV). This corroborates the predicted sensitivity
of $\mathrm{MDA}_{\mathrm{FRS}}$ to $t_B$.

A more powerful broadband few-cycle infrared source will improve
$\mathrm{DR}_E$ while preserving the full bandwidth along with the reduced $t_B$. This
holds promise for a detection limit below 50 ng ml⁻¹ in combination
with super-octave spectral coverage.


Attosecond-timed molecular signals 


For the investigation of complex molecular consortia, the sensitivity
and specificity of FRS-based molecular fingerprinting depend
critically on the temporal coherence of the GMF signal and its reproducibility
over extended measurement time. In gas-phase samples, vibrational
dephasing occurs on the nanosecond scale and the required long
acquisition delays are advantageously realized with two asynchronous
femtosecond oscillators, harnessing optical frequency-comb
techniques. By contrast, in the liquid phase the coherent molecular
signal survives only for several picoseconds. To efficiently use
measurement time and ensure attosecond delay precision, we implemented
waveform sampling with a mechanical delay line equipped with
interferometric delay tracking. Figure 4a shows the field-resolved GMF of
a human blood serum sample, as representative of a cell-free bioliquid
routinely used in biomedical profiling. The insets in Fig. 4a, b show
the differential GMF of the biomolecular ensemble in the sample, as a
result of subtracting the signal obtained from pure water from the one
of the sample. This ‘pure’ biomolecular signal decays by a few orders
of magnitude within 5 ps (compare the left and right panels in Fig. 4b),
revealing a dephasing time of collective biomolecular vibrations in
human blood serum far below 1 ps.

Five hundred consecutive measurements of the same serum sam- 
ple yield a relative root-mean-square deviation of the field oscilla- 
tion amplitude from its mean value of around 0.2% and an absolute 
root-mean-square of the zero crossings of the infrared GMF field in 
the range of 20 as, within the first two picoseconds following the exci- 
tation (Fig. 4c, d). It is this reproducibility that enables suppression 
of the electric field background by up to three orders of magnitude 
via comparison with a reference field (Figs. 2a and 4a), opening the 
window for background-free measurement less than 2 ps after the 
excitation pulse peak, even in a highly complex sample such as blood 
serum (Fig. 4a, magenta line). 


Sensitivity and specificity of FRS 


In real-world applications, molecular fingerprinting of complex
biofluids will need to probe minuscule changes in the sample’s chemical
composition, often caused by low-abundance molecules. The method’s
utility for biological or medical applications will be greatly dependent
on the smallest changes in molecular concentration that can cause a
detectable distortion of the field-resolved GMF. To assess this
concentration level, we added controlled amounts of DMSO₂ to the serum
sample fingerprinted in Fig. 4a. The results of a principal component
analysis of the infrared fingerprints of these samples, measured with
our FRS and FTIR devices (Supplementary Information section VI and
Extended Data Fig. 8) are shown in Fig. 5a. The plots show the mean and
the spread of the data classes of repeated measurements of samples
with different concentrations of the added molecule, along the first
principal component. FRS appears to clearly separate the sample
containing additional DMSO₂ molecules at a concentration of 500 ng ml⁻¹
from the reference sample. Moreover, the error bars suggest that FRS
is capable of detecting changes in molecular concentration down to
the 200 ng ml⁻¹ level in human blood serum, an improvement of
nearly an order of magnitude compared to state-of-the-art FTIR
spectrometry.
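
The principal component analysis itself is described only in the Supplementary Information, so the following is a minimal pure-Python sketch of the idea rather than the procedure used here. It builds synthetic fingerprints (the baseline, band shape, noise level and all function names are made up for illustration), finds the first principal component by power iteration, and reports how far apart a reference class and a spiked class lie along that component.

```python
import math
import random
import statistics


def synthetic_fingerprint(conc, rng, n_points=20, noise=0.02):
    """Made-up spectrum: sloped baseline + conc * one analyte band + noise."""
    spectrum = []
    for j in range(n_points):
        baseline = 1.0 + 0.05 * j
        band = 0.5 * math.exp(-(((j - 8) / 2.0) ** 2))
        spectrum.append(baseline + conc * band + rng.gauss(0.0, noise))
    return spectrum


def first_pc_scores(rows, iterations=100):
    """Project mean-centred rows onto the first principal component.

    The component is found by power iteration on X^T X (X = centred data).
    """
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centred = [[r[j] - mean[j] for j in range(d)] for r in rows]
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        proj = [sum(x[j] * v[j] for j in range(d)) for x in centred]
        w = [sum(proj[i] * centred[i][j] for i in range(n)) for j in range(d)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    return [sum(x[j] * v[j] for j in range(d)) for x in centred]


def class_separation(scores, n_first_class):
    """Distance between class means in units of the larger class spread."""
    a, b = scores[:n_first_class], scores[n_first_class:]
    spread = max(statistics.stdev(a), statistics.stdev(b))
    return abs(statistics.mean(a) - statistics.mean(b)) / spread


def demo(seed=0, n_repeats=10):
    """Separation of reference vs spiked synthetic samples along PC1."""
    rng = random.Random(seed)
    reference = [synthetic_fingerprint(0.0, rng) for _ in range(n_repeats)]
    spiked = [synthetic_fingerprint(1.0, rng) for _ in range(n_repeats)]
    scores = first_pc_scores(reference + spiked)
    return class_separation(scores, n_repeats)


if __name__ == "__main__":
    print(f"class separation along PC1: {demo():.1f} spreads")
```

A class is regarded as resolved when the distance between class means clearly exceeds the spread of repeated measurements, which is the criterion the error bars in Fig. 5a convey.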

Hence, the smallest changes currently detectable are more than
five orders of magnitude below the concentration of the most highly
abundant molecules of blood serum, albumin. This implies a detectable
concentration dynamic range in excess of 10⁵.

Although the relative intensity noise of the excitation does not affect
the FRS limit of molecular detection with a spectrally isolated feature,
the lowest detectable concentration of the same molecule in a complex
environment is limited by the relative intensity noise of the overall GMF
signal. This, in turn, is likely to be dominated by the noise of the excitation
source. As an important consequence, the current FRS concentration
dynamic range of 10⁵ offers substantial room for further improvement
by suppressing the noise of the GMF signal. An efficient measure to this
end may be ‘freezing’ the excitation source noise by scanning faster
than the characteristic time of low-frequency intensity fluctuations.

Fig. 6 | FRS of strongly absorbing living systems. a, The blue-outlined (left)
panel is an optical microscope image of cultured human THP-1 cells. The
green-outlined panel (right) shows the top and lateral views of a leaf from Salix caprea.
The measurement of the intact hydrated leaf was performed 5 min after
collection, within the marked area. b, The upper panel shows GMF of THP-1 cells
in suspension, contained in a 100-µm-thick cuvette (blue line) referenced by
numerical subtraction to the signal of the suspension medium
(phosphate-buffered saline, PBS; grey line). The lower panel shows the molecular response
obtained after transmission through a 120-µm-thick leaf of Salix caprea (green
line) with air reference (grey line). c, Absorption (top panel) and phase (lower
panel) spectra of five measurements of human THP-1 cells (blue lines) along
with the amplitude and phase of temporally-filtered GMFs (magenta lines).
d, Absorption and phase spectra of the plant leaf shown in a. The standard
deviations of multiple measurements in c and d are indicated by the shaded
areas (see Supplementary Information section VII for data processing). We
note that the error corridor of the measurement in d is smaller than the line
thickness and therefore not visible. The grey dotted lines in c and d indicate
prominent absorption peaks.


To test the specificity of the measured fingerprints, that is, the
sensitivity to small changes in relative concentrations, we prepared aqueous
solutions of two different sugar molecules of constant total concentration
and varying relative concentrations (Supplementary Information
section VI). The total concentration of 100 µg ml⁻¹ was chosen to be
well above the limit of detection of both instruments. To challenge the
method, we used two molecules, maltose and melibiose, which have
very similar absorption characteristics (Supplementary Information
section VI and Extended Data Fig. 9). The data in Fig. 5b reveal that FRS
outperforms FTIR spectrometry in sensing not only small absolute
changes but also relative changes in concentration of
molecules of a complex ensemble.


Probing of intact biological systems 


Non-invasive, quantitative probing of intact biological systems would
benefit a diversity of biological, biomedical, pharmaceutical and
ecological applications. To circumvent sensitivity limitations caused by the
strong absorption of infrared radiation in liquid water, so far the majority
of studies of biological matter have drawn on sample preparations
that substantially alter the state of the sample (such as drying, fixation,
slicing, chemical extraction, homogenization and so on). Direct
interrogation of intact living systems with infrared spectroscopy has been
limited to interaction lengths of the order of 10 µm (or less), either in
attenuated-total-reflection geometry or by using extremely thin
microfluidic cuvettes. Both implementations prevent the majority of living
cells from being studied in vivo (for example, human cells are on average
larger than 10 µm in diameter). More recently, quantum-cascade lasers
have enabled infrared transmission measurements of living systems with
path lengths of several tens of micrometres, albeit with restrictions on
the bandwidth and with modest signal-to-noise ratios.

The unparalleled dynamic range of FRS implemented with a powerful
few-cycle infrared source enables these restrictions to be overcome.
Here we demonstrate the feasibility of infrared fingerprinting of living
human cells (THP-1 leukaemic-monocyte-like cell line) cultured and
measured directly in suspension (Fig. 6a, left panel) by transillumination
of a 0.1-mm-thick flow-through cuvette (see also Supplementary
Information section VII). In spite of the order-of-magnitude increase in
interaction length as compared to previous broadband measurements
of cells from the same cell line, the differential signal originating from
the molecules of the cells (blue line in Fig. 6b) is acquired with a high
signal-to-noise ratio (Supplementary Information section VII). The
corresponding absorption and phase spectra are depicted in Fig. 6c
(blue lines), with the former reflecting well the spectral signatures
featured by THP-1 cells when squeezed into a 7-µm-thick cuvette.
Temporal gating of the molecular signal (magenta lines in Fig. 6c) uncovers
the splitting of the absorption lines at approximately 1,080 cm⁻¹ and
1,230 cm⁻¹, along with relevant phase oscillations, features that are not
apparent in the time-integrated spectra (blue lines). This underlines
the power of isolating the molecular signal from an (inherently) noisy
excitation, offered by FRS.

We have further tested the ability of FRS to acquire transmission
spectra of strongly absorbing samples by transilluminating intact
plant leaves from the goat willow (Salix caprea), a common deciduous
tree, with a thickness of approximately 120 µm (Fig. 6a, right panel).
The spectra in Fig. 6d feature clearly discernible absorption bands at
1,050 cm⁻¹, 1,078 cm⁻¹ and 1,103 cm⁻¹, corresponding to the C-O stretching
motion characteristic of carbohydrates widespread in cell walls
and cellular compartments of plant leaves. The spectrally resolved
attenuation ranges from 5 to 8 orders of magnitude, which is orders
of magnitude higher than previously demonstrated in a broadband
infrared transmission measurement. In addition, it shows the
instrument’s ability to resolve absorption over several orders of magnitude
in strength without the need to adjust the light power reaching the
detector.


Conclusions and outlook


We have measured infrared-electric-field molecular fingerprints of 
organic molecules in aqueous solution and in human blood sera. In both 
settings, the limit of detecting changes in concentration of individual 
molecules lies in the range of hundreds of nanograms per millilitre 
for less than one minute of data acquisition time. The amplitude of 
the coherent emission carrying the GMF of human blood serum was 
observed to decay by a few orders of magnitude within a few picoseconds.
The reproducibility of electric-field oscillations was found to be
in the range of tens of attoseconds over a temporal span exceeding six
picoseconds following the excitation.

These findings emphasize the performance of FRS of impulsively 
excited molecular vibrations for GMF of complex biofluids and uncover 
potential for its further improvement. First, the extremely fast (much 
less than a picosecond) decay of vibrational coherence in human blood 
serum suggests an exponential improvement of the detection limit with 
further steepening of the temporal decay of the excitation transmitted 
through the sample. Second, the coherence of the recorded molecular
signal over spans of several picoseconds along with reduced
source-noise-induced GMF noise, by rapid scanning, for example, will increase
the detectable range of concentrations in biofluids. The capability of
simultaneous probing of multi-molecular changes over a dynamic range
of detectable concentration changes in excess of 10⁵ holds promise for
applications in the life sciences and medical diagnostics.

Last, broadband infrared fingerprinting of physiologically relevant 
living human cells is now feasible in transmission, opening the door 
for combining infrared fingerprinting with standard flow cytometry. 
The unparalleled dynamic range of FRS implemented with powerful 
few-cycle light promises a new regime of transmission-mode vibra- 
tional spectroscopy and spectro-microscopy of intact living systems: 
individual biological cells, bulk-cell and tissue cultures, organs such 
as plant leaves—all settings in which excessive water absorption has so 
far constituted a major obstacle. 


Methods


Nonlinear time-domain gating in FRS 

Here, we elucidate the qualitative differences between FRS and
traditional, frequency-resolved spectroscopy. For the latter, we choose FTS
as perhaps the most advanced form of frequency-resolved infrared
spectroscopy, in particular in the dual-frequency-comb implementation.
Furthermore, the interferograms obtained by FTS performed
either with ultrashort pulses or with broadband, incoherent light
resemble the electric field emerging from a sample after resonant
excitation with a few-cycle infrared pulse, which FRS samples with
sub-optical-cycle resolution by means of nonlinear optics (see Fig. 1b). To
understand the important performance differences between the two
techniques, it is essential to recognize the conceptual differences in the
acquisition of these time-domain signals. First, using simple formalisms
for the signals acquired in FTS and FRS, we reveal two major advantages
introduced by the time-domain, nonlinear-conversion-based
gating of the sampled electric field in FRS over FTS: the robustness
of detection sensitivity against technical noise of the MIR excitation
transmitted through the sample, and the mitigation or circumvention of
the detector-dynamic-range limitation of sensitivity inherent to FTS.
Then, we evaluate the performance of FTS achievable with our coherent
infrared source and state-of-the-art infrared detection (both described
in Supplementary Information section I), employing a well established
frequency-domain formalism. Contrasting the results with those
of FRS presented in this work, we observe detection sensitivities
higher by more than a factor of 30 for FRS of impulsively excited
molecular signals decaying with a time constant on the order of 1 ps,
as is typical for liquid-phase samples, owing to the above-mentioned
advantages.

Extended Data Fig. 1a illustrates the working principle of FTS. Here,
we consider an ultrashort-pulsed MIR excitation source. Its broadband
pulses are sent along two arms of an interferometer, one of which
contains the sample and one of which acts as a ‘local oscillator’ for
homodyne (or heterodyne) detection. The field transmitted through
the sample is the convolution of the sample response with the incident
excitation field $E_{\mathrm{ex}}(t)$. It can be written as the sum of (1) a non-resonant
response representing an attenuated (and temporally altered) version
of $E_{\mathrm{ex}}(t)$, which for simplicity we approximate here as $aE_{\mathrm{ex}}(t)$, with a
scalar $a < 1$, and (2) the response $E_{\mathrm{GMF}}(t)$ of the resonantly excited
molecules (a more rigorous treatment of the sample response is given in
Supplementary Information section II). The field $E_{\mathrm{LO}}(t-\tau)$ in the local
oscillator arm is a copy of $E_{\mathrm{ex}}(t)$, delayed by a variable time $\tau$. FRS
implemented with EOS (Extended Data Fig. 1b) employs a near-infrared (NIR)
gate pulse $E_g(t-\tau)$ fulfilling two functions (see also Supplementary
Information section I). First, this pulse ‘carves out’ an ultrashort portion
of the sample response, for instance via a second-order nonlinear
upconversion process. Second, it acts as a local oscillator in the
homodyne/heterodyne detection of this upconverted signal.

In both schemes, at each delay $\tau$, the superposition of the sample
response (time-gated and upconverted in the case of FRS) and local
oscillator fields is sent to (usually two) t-integrating intensity detectors
placed at each of the sum and difference ports of the beam combiner.
In the wake of the excitation, where the strength of $aE_{\mathrm{ex}}(t)$ can
be neglected against that of $E_{\mathrm{GMF}}(t)$, the resulting signals recorded by
the two respective detectors read:


$$I_{\mathrm{FTS},1,2}(\tau) = \int [aE_{\mathrm{ex}}(t) + E_{\mathrm{GMF}}(t)]^2\,\mathrm{d}t + \int E_{\mathrm{LO}}^2(t-\tau)\,\mathrm{d}t \pm 2\int E_{\mathrm{GMF}}(t)\,E_{\mathrm{LO}}(t-\tau)\,\mathrm{d}t \qquad (1\mathrm{a})$$


$$I_{\mathrm{FRS},1,2}(\tau) = \int [\chi E_g(t-\tau)\,E_{\mathrm{GMF}}(t)]^2\,\mathrm{d}t + \int E_g^2(t-\tau)\,\mathrm{d}t \pm 2\int \chi E_{\mathrm{GMF}}(t)\,E_g^2(t-\tau)\,\mathrm{d}t \qquad (1\mathrm{b})$$


where $\chi E_g(t-\tau)E_{\mathrm{GMF}}(t)$ is a qualitative expression for the time-gated,
upconverted sample response in FRS, neglecting effects such as phase
matching or depletion/saturation. The first two right-hand-side terms
of equation (1a, b) represent a background (direct-current baseline)
around which the third term, containing the spectroscopic information,
oscillates. A major difference stems from the first background
term in the two equations and immediately becomes apparent after
two approximations. In equation (1a), this term can be approximated
by $\int [aE_{\mathrm{ex}}(t)]^2\,\mathrm{d}t$, which is typically orders of magnitude larger than the
(time-integrated) GMF signal. In equation (1b), owing to temporal
gating, the first right-hand-side term is orders of magnitude
smaller than the other two terms (see Extended Data Fig. 1c), and can
be neglected. With these two approximations, equation (1a, b)
becomes:


$$I_{\mathrm{FTS},1,2}(\tau) \approx \int [aE_{\mathrm{ex}}(t)]^2\,\mathrm{d}t + \int E_{\mathrm{LO}}^2(t-\tau)\,\mathrm{d}t \pm 2\int E_{\mathrm{GMF}}(t)\,E_{\mathrm{LO}}(t-\tau)\,\mathrm{d}t \qquad (2\mathrm{a})$$


$$I_{\mathrm{FRS},1,2}(\tau) \approx \int E_g^2(t-\tau)\,\mathrm{d}t \pm 2\int \chi E_{\mathrm{GMF}}(t)\,E_g^2(t-\tau)\,\mathrm{d}t \qquad (2\mathrm{b})$$
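
The three-term structure of the FTS signal can be recovered with a short expansion. This is a sketch under the idealized assumption that the two detectors integrate the squared sum and difference of the sample and local-oscillator fields, with overall beam-splitter factors dropped; that idealization is ours and is not spelled out in the text.

```latex
\begin{align}
I_{1,2}(\tau)
  &= \int \bigl[aE_{\mathrm{ex}}(t) + E_{\mathrm{GMF}}(t)
        \pm E_{\mathrm{LO}}(t-\tau)\bigr]^2 \,\mathrm{d}t \\
  &= \int \bigl[aE_{\mathrm{ex}}(t) + E_{\mathrm{GMF}}(t)\bigr]^2 \mathrm{d}t
   + \int E_{\mathrm{LO}}^2(t-\tau)\,\mathrm{d}t
   \pm 2\int \bigl[aE_{\mathrm{ex}}(t) + E_{\mathrm{GMF}}(t)\bigr]
        E_{\mathrm{LO}}(t-\tau)\,\mathrm{d}t .
\end{align}
```

For delays in the wake of the excitation, the local-oscillator pulse no longer overlaps with the transmitted excitation, so only the GMF survives in the cross term. The first integral, by contrast, runs over the whole excitation regardless of the delay, which is why it remains as a large delay-independent background in FTS. The same expansion applied to the gated field $\chi E_g(t-\tau)E_{\mathrm{GMF}}(t)$ with the gate pulse as local oscillator yields the FRS expression.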

The fact that in FTS the time-integrated excitation transmitted 
through the sample always impinges on the detector(s), whereas in 
FRS this background term is negligible in the wake of an impulsive 
excitation, illustrated by equation (2a, b), has two far-reaching implica- 
tions, described as follows. 


Robustness of FRS against excitation noise. Although for both
schemes the contribution of the local-oscillator term to the background
can be readily reduced to the shot-noise/detector-noise level,
for example, via lock-in detection (see Supplementary Information
section I), in FTS the minimum detectable molecular signal is directly
affected by the technical noise of the MIR excitation, whose contribution
to the recorded signal is constant along the entire delay range.
This requires its suppression by sophisticated fast scanning methods
and/or balancing techniques. In spite of all these efforts, photon
quantum-noise-limited sensitivity has not been experimentally
demonstrated for broadband measurements at wavenumbers below
2,000 cm⁻¹, to the best of our knowledge. In FRS, by contrast,
excitation-background-free detection of the molecular signal in the
wake of an impulsive excitation implies a sensitivity that is ultimately
limited by the quantum noise of the NIR gating field but largely immune
to the noise of the MIR excitation.


Circumvention or mitigation of detector-dynamic-range-induced
sensitivity limitation. In FTS, the usable input power is restricted
by the excitation, transmitted through the sample, saturating the
detector(s); see the first right-hand-side term of equation (2a). This
implies a severe detector-dynamic-range-induced sensitivity limit
that can only be circumvented/mitigated by techniques such as
spectral multiplexing or building the difference between a sample and
a reference response to the same excitation interferometrically,
before detection. This adds substantial complexity to any detection
scheme and has not been widely used so far. In FRS, for a fixed
local-oscillator power (set to be below the detector saturation level), the
signal-to-noise ratio can readily be increased by increasing the
excitation field, which linearly increases the sought-for molecular signal
$E_{\mathrm{GMF}}(t)$ in the third right-hand-side term in equation (2b). Because the
excitation signal transmitted through the sample is eliminated by the
femtosecond temporal gate, the molecular signal can, in principle, be
increased up to levels at which $aE_{\mathrm{ex}}(t)$ vastly exceeds the saturation
level of any available detector.


Sensitivity estimation of FTS implemented with our infrared source 
Here, we calculate the expected sensitivity for an FTS implementation
employing our infrared radiation source and state-of-the-art MIR
detectors. Because of the delay-independent contribution of excitation
noise to the recorded signal (see above), time-domain filtering of the
recorded signal does not have such a dramatic effect as in FRS, and well
established frequency-domain models for FTS lend themselves to a
sensitivity estimation. Here we use the model of Newbury et al., who
derived an expression for the frequency-domain signal-to-noise ratio as
a function of detector noise, shot noise, excess laser relative intensity
noise (RIN) and detector dynamic range. Although the formula was
derived for dual-comb spectroscopy, it can be readily applied to FTS
with (slow) mechanical scan, with our experimental parameters (see
Supplementary Information section I, Extended Data Fig. 5 and summary
in Extended Data Table 1). In addition, we assume no limitations
due to digitization, no sequential or parallel multiplexed acquisition
and a duty cycle of 1. The power level in both the signal and the local
oscillator arms was set to 0.45 mW, limited by detector saturation and
well within the range of our source.

For direct comparison with our FRS results, we consider the absorption
of DMSO₂ dissolved in water, spectrally centred at 1,139 cm⁻¹ (see
Extended Data Fig. 6 and parameters in Extended Data Table 1). According
to equation (4) of ref. ”, for these parameters we obtain a limit of
detection of 7 µg ml⁻¹ of DMSO₂ dissolved in water for FTS, which is a
factor of 35 above what is demonstrated here with FRS.


Experimental setup 

The instrument (see also Supplementary Information section I for a
detailed description) is based on a Kerr-lens mode-locked thin-disk
Yb:YAG oscillator emitting a 28-MHz repetition-rate train of 220-fs
pulses, spectrally centred at 1,030 nm. After temporal compression via
nonlinear spectral broadening based on multi-pass self-phase modulation
in bulk fused silica followed by chirped-mirror compressors,
the resulting NIR pulses are 16 fs long, with an average power of 60 W.
These pulses drive intrapulse difference-frequency generation (optical
rectification) in a 1-mm-thick LiGaS₂ crystal. The emerging MIR
radiation with an average power of the order of 100 mW is spectrally
tunable with a coverage of nearly one octave around a central frequency
of 1,200 cm⁻¹. After the crystal, the NIR pulse is recycled and used for
gating in the EOS detection of the MIR waveforms. Balanced detection
in EOS is optimized close to the NIR shot-noise limit, with an impinging
NIR power on the GaSe EOS crystal of 420 mW. In order to reduce
phase artefacts introduced by variations of the mutual delay between
the MIR sampled wave and the NIR sampling pulse, we track this delay
interferometrically, with an additional continuous-wave laser. In this
manner, data can be recorded with few-nanometre delay precision and
a temporal duty cycle close to 100% during forward as well as backward
scans. Starting with the last NIR pulse compression stage, all the beams
are enclosed in vacuum chambers at a background pressure in the
1-mbar range. Further measures of stabilization include an
acousto-optical-modulator-based active noise eater and lock-in detection
employing mechanical chopping of the MIR beam.
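
To relate the few-nanometre delay precision quoted here to the attosecond timing figures reported for the serum measurements, the following sketch converts an optical path-length change into a time delay via Δt = Δx/c. It assumes propagation in vacuum and treats Δx as the total path-length change (a retroreflecting delay stage would double the path change per unit of mirror travel); the helper names are illustrative.

```python
C_M_PER_S = 299_792_458.0  # speed of light in vacuum, m/s


def path_nm_to_delay_as(path_nm: float) -> float:
    """Time delay in attoseconds for an optical path change in nanometres."""
    return path_nm * 1e-9 / C_M_PER_S * 1e18


def delay_as_to_path_nm(delay_as: float) -> float:
    """Optical path change in nanometres for a time delay in attoseconds."""
    return delay_as * 1e-18 * C_M_PER_S * 1e9


if __name__ == "__main__":
    print(f"1 nm  -> {path_nm_to_delay_as(1.0):.2f} as")
    print(f"20 as -> {delay_as_to_path_nm(20.0):.1f} nm")
```

One nanometre of optical path corresponds to about 3.3 as, so a zero-crossing reproducibility in the range of 20 as is equivalent to roughly 6 nm of path, consistent with few-nanometre delay tracking.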


Dynamic range of FRS 

The 500-µm-thick GaSe electro-optic crystal constitutes a trade-off
between a high quantum efficiency and broad bandwidth (Fig. 1d). In
addition, it avoids internal reflections within the measurement time
window. This quantum-efficiency-optimized apparatus resulted in
a linearity of the instrument response over more than seven orders
of magnitude of electric-field strength and, moreover, the intensity
dynamic range scales linearly with measurement time (Extended Data
Fig. 2). Thus, sampling of the oscillating electric field rather than its
cycle-averaged intensity results in an unprecedented linear-response
intensity dynamic range of >10¹⁴, vastly exceeding that of infrared
spectroscopy so far, to our knowledge. This enables transillumination of
aqueous samples of several tens of micrometres in thickness while
maintaining a high signal-to-noise ratio.
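
Because intensity scales with the square of the electric field, a dynamic range expressed in field amplitude doubles, in orders of magnitude, when expressed in intensity. The bookkeeping, together with the stated linear scaling of the intensity dynamic range with measurement time, can be sketched as follows; the function names and the example time ratio are illustrative, not taken from the paper.

```python
import math


def field_to_intensity_orders(field_orders: float) -> float:
    """Orders of magnitude in intensity for a given span in field amplitude."""
    return 2.0 * field_orders


def intensity_ratio_to_db(ratio: float) -> float:
    """Express an intensity ratio in decibels."""
    return 10.0 * math.log10(ratio)


def scale_field_range(field_range: float, time_ratio: float) -> float:
    """Field dynamic range after changing measurement time by time_ratio.

    The intensity range grows linearly with measurement time, so the
    field range grows with its square root.
    """
    return field_range * math.sqrt(time_ratio)


if __name__ == "__main__":
    print(field_to_intensity_orders(7.0), "orders of magnitude in intensity")
    print(f"{intensity_ratio_to_db(1e14):.0f} dB")
    # Hypothetical example: 100x longer averaging, 10x larger field range.
    print(scale_field_range(1.0, 100.0))
```

Seven orders of magnitude in field strength therefore correspond to fourteen orders of magnitude in intensity.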


Measurement principle and the nature of the signal 

FRS molecular fingerprinting relies on the generation of ultrashort 
infrared pulses with identically repeating electric-field waveforms (in 
our setup, 28 million such pulses per second). These pulses are transmit- 
ted through the sample under investigation, and the waveforms emerg- 
ing from this interaction are recorded with EOS (see Supplementary 
Information section 1). The spatial distribution of microscopic electric 
charges (thatis, electrons and nuclei) in organic molecules is (1) inhomo- 
geneous and (2) characteristic of the molecular species. Because of (1), 
when the electric field of the above-mentioned infrared pulses interacts 
with the molecules, it induces microscopic spatial charge separations 
(due to the existence of electric dipole moments). These charge sepa- 
rations evolve in time, driven by the oscillating electric field. Because 
of (2), these microscopic charge oscillations occur with characteristic 
magnitudes and frequencies—albeit having a fixed mutual timing, set by 
the commonexcitation field. In particular, resonant vibrations oscillate 
long after the excitation by the few-cycle infrared waveform, emanat- 
ing a GMF. This resonant response is the coherent superposition of 
the fields of all sample-specific oscillations, thus containing most of 
the sample-specific information. Importantly, at the centre frequency 
of any such oscillation, the emission of light as a consequence of the 
resonant excitation by a light field occurs with opposing phase to the 
latter’. Consequently, the coherent superposition of the GMF and the 
excitation transmitted through the sample results in a destructive inter- 
ference at these frequencies, leading to the typical ‘absorption dips’ 
observed in frequency-domain spectroscopy; see Fig. 1c. 
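
The interference argument above can be illustrated numerically. The following is a minimal sketch, not the authors' model: it assumes a single Lorentzian line at 1,139 cm⁻¹ with a 13.47 cm⁻¹ width (the values quoted for DMSO₂), a hypothetical Gaussian excitation spectrum and a hypothetical peak absorbance of 0.05. The function name `fields_after_sample` and all variable names are illustrative only.

```python
import numpy as np


def fields_after_sample(nu, e_exc, nu0, fwhm, peak_absorbance):
    """Return (total transmitted field, resonant field) for one Lorentzian line.

    peak_absorbance is alpha * path length at line centre (intensity, natural log).
    """
    gamma = fwhm / 2.0                                   # half width at half maximum
    lineshape = gamma / (gamma + 1j * (nu - nu0))        # complex Lorentzian, equals 1 at nu0
    t_field = np.exp(-0.5 * peak_absorbance * lineshape) # field (not intensity) transmission
    e_total = e_exc * t_field
    e_resonant = e_total - e_exc                         # field added by the molecular response
    return e_total, e_resonant


# Wavenumber axis (cm^-1) and a hypothetical Gaussian excitation spectrum.
nu = np.linspace(1000.0, 1300.0, 3001)
e_exc = np.exp(-((nu - 1150.0) / 80.0) ** 2)

e_total, e_resonant = fields_after_sample(nu, e_exc, nu0=1139.0, fwhm=13.47,
                                          peak_absorbance=0.05)

# Phase of the resonant field relative to the excitation at line centre.
i0 = int(np.argmin(np.abs(nu - 1139.0)))
relative_phase = np.angle(e_resonant[i0] / e_exc[i0])

# Intensity transmission: the superposition of both fields shows a dip at nu0.
intensity_transmission = np.abs(e_total) ** 2 / np.abs(e_exc) ** 2

print("|relative phase| / pi at line centre:", abs(relative_phase) / np.pi)
print("dip position (cm^-1):", nu[np.argmin(intensity_transmission)])
print("transmission at the dip:", intensity_transmission[i0])
```

In this sketch the resonant field at line centre is in antiphase with the excitation, so adding the two fields lowers the spectral intensity there, which is the 'absorption dip' described in the text.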


Data availability 


The data that support the findings of this study are available from the 
corresponding author upon reasonable request. 
b, Schematic of FRS. c, Portions of the background signal contributed by the
sample response to the FTS (blue, first right-hand-side term of equation (1a))
and to the FRS (red, first right-hand-side term of equation (1b)) signals at a fixed

efficiency was set to land the ‘carved out’ effective window time length to 50 fs
(without loss of generality). Example parameters: 190-fs Gaussian excitation
pulse and 1,139-cm⁻¹ DMSO₂ absorption (see Extended Data Table 1).
recorded with attenuating optical density (OD) filters instead of the cuvette in
the beam path, for increasing attenuation and measurement time T. A 1,200-fs
scan range and T = 16 s and T = 1,600 s were considered. Small variations of the
pulse shape for different attenuations are attributed to slight dispersion
variations among the OD filters. The attenuation-independent pulse shape

measurement) and phase of the signals in a, respectively. The detection noise
floors in b were obtained by blocking the MIR signal and evaluating the mean of
the (white) noise in the considered spectral range, and confirm the linear
decrease of the noise floor with T. For the data in c, for all time-domain
waveforms a super-Gaussian filter (width 700 fs, order 20) was applied.
beam path (see text). The pulse was temporally compressed with customized
dispersive optics. Pulse compression. EOS traces of the excitation pulse
transmitted through water in the bandwidth-optimized instrument setting,
with (blue) and without (red) four dispersive mirrors in the MIR beam path.

c, As in b but on a logarithmic scale, visualizing the improved roll-off of the
signal achieved with the dispersive optics.
Extended Data Fig. 4 | Contributions to quantum efficiency in electro-optic
sampling. a, Frequency-resolved measurement of the noise of the balanced
detection (black), and calculated shot noise (red). The dashed line indicates the
lock-in frequency, and its peak stems from the chopper. b, Comparison of MIR
power depletion after EOS crystal for the two different crystal thicknesses. The
oscillations originate from interferences of the MIR pulse incident to the EOS
crystal and MIR radiation generated therein (these oscillations do not affect
the performance of EOS detection).
Extended Data Fig. 5 | Measurement of noise contributions for the
estimation of the performance of FTS with our femtosecond-laser-based
source, our mechanical scan, and state-of-the-art infrared detection. a, The
setup mimics an FTS setup in the Mach-Zehnder configuration, with balanced
lock-in detection. For lock-in frequency modulation, a mechanical chopper is
placed in the ‘sample arm’. The two arms are recombined with a 50:50 beam
splitter. The two outputs are detected with two independent MIR detectors
(see text for details). The power impinging on each detector was limited to
450 µW, corresponding to a detector output voltage of 20 V. The relative
intensity noise (RIN) spectrum of the source is recorded with an FFT-Analyzer
in the range 0.1-100 kHz (before balanced detection). Balanced lock-in
detection is performed with a lock-in amplifier with differential input. The
beam block was used in the measurements shown in c. b, RIN spectrum of the
free-running (red curve) and intensity-stabilized (blue curve) MIR beam
(before the interferometer). The integrated RIN of the stabilized source between
1 Hz and 100 kHz is as low as 0.04%. c, Demodulated (after lock-in detection with
a time constant of 1.6 ms and 4th-order filter) time-domain trace of detector
noise (grey), local-oscillator signal with sample arm blocked (turquoise) and of
the combination of both interferometer arms impinging on the balanced
detection (blue). The inset shows a 1-second section of the signals, for a
detailed comparison of the local-oscillator noise and the detector noise.
Extended Data Fig. 6 | Simulations of time-domain decay of a molecular
Lorentzian oscillator. a, Fit of a Lorentzian oscillator to the 1,139 cm⁻¹
absorption of (low-concentration) DMSO₂. Black line, intensity transmission
through pure, molecular DMSO₂, determined by referencing the transmission
spectrum of a 1 mg ml⁻¹ solution to that of water, measured via FTIR, and
normalizing to a 1-µm path. Green line, least-squares fit (1,080-1,190 cm⁻¹) of a
Lorentzian oscillator to the 1,139 cm⁻¹ absorption, yielding a full width at half
depth of 13.47 cm⁻¹ and an absorption coefficient α = 11.96 cm⁻¹. The numerical
example shows the instantaneous and resonant parts of the electric field as
described by equations (1) to (4) in Supplementary Information section II. The
initial pulse is a Gaussian pulse with an intensity envelope (full width at half
maximum) of 190 fs. The Lorentzian absorption band has a peak of α₀z with
α₀ = 0.0024 cm⁻¹, corresponding to a 200 ng ml⁻¹ solution of DMSO₂ in water,
and a width δν = 13.47 cm⁻¹. These values were obtained from fitting a
to C. At the absorption maximum, the discrepancy between the resonant
response as in Supplementary Information section 2 and its approximation as
in Supplementary Information section 3 is 1%, justifying this convenient
approximation. The error introduced by band-pass filtering the resonant
response between 1.5 ps and 4 ps compared to the high-pass time-filtered
signal is 4%.
Extended Data Fig. 7 | Spectral intensity of the Fourier-transformed
temporal fingerprints of DMSO₂. Spectral intensity is shown for different
concentrations, after high-pass time-filtering at t = 1,500 fs and subtraction of
pure water reference, normalized to the spectral intensity of the reference
pulse. Green dashed lines, modelled Lorentzian oscillator with the parameters
derived from the fit in Extended Data Fig. 6. This model agrees excellently with
the measured fingerprints, and confirms the minimum detectable absorbance
predicted by equation (2) as well as the linear response of the instrument.
Extended Data Fig. 8 | Principal component analysis. a-d, Comparison of the
loading vectors for the first principal component for the FTIR data (a) and the
FRS data (b) from the serum spiking experiment, with the pre-processed GMF
data (see text) of the FTIR (c) and FRS (d) measurements of a 1 mg ml⁻¹ DMSO₂
solution. We note that the FRS spectra are complex, so the real and imaginary
parts were considered separately (and stitched to single vectors). e, Figure of
merit (FOM) (colour scale in arbitrary units; see Supplementary Information
section VI) quantifying the separation of classes according to the first principal
component (the lower the FOM, the better the separation), evaluated for a large
range of the beginning time t and time window length Δt. The cross indicates
parameters yielding optimum separation. f-i, Comparison of the loading
vectors for the first principal component for the FTIR data (f) and the FRS data
(g) from the sugar mixture experiment, with the pre-processed GMF data of the
FTIR (h) and FRS (i). For the latter, the difference of the spectra of the 50/50
mixture and the pure maltose solution is shown. The real and imaginary parts
were considered separately. j, FOM quantifying the separation of classes
according to the first principal component, in analogy to e.
Extended Data Fig. 9 | Absorption spectra of 10 mg ml⁻¹ aqueous solutions of maltose and melibiose, measured by FRS and FTIR. The difference in total
absorption is due to the differing cuvette thickness. a, FRS; b, FTIR. OD, optical density.


Extended Data Table 1 | Parameters for numerical estimation of the sensitivity of FTS implemented with our infrared source

Parameter                  Quantity                    Comment
Central wavelength         λ_c = 8.5 µm
Spectral width             Δν_FWHM = 180 cm⁻¹
MIR power (FTS)            P_LO = P_S = 0.45 mW        Maximum incident power limited by detector saturation
MIR detector noise         NEP_MCT = 2.5 pW/Hz^0.5     e.g.: InfraRed Associates; MCT-13-1.00
Relative intensity noise   RIN = 2.7 x 10° 1/Hz^0.5
FTS detector efficiency    η_FTS = 1                   Quantum efficiency is not stated in the detector datasheet
Measurement time           T = 37 s                    Measurement time for sample and reference measurement
Spectral resolution        ν_res = 4.7 cm⁻¹            This corresponds to a scan time window of 7 ps

DMSO₂ absorption line at ν_DMSO₂ = 1,139 cm⁻¹

Absorptivity               α_DMSO₂ = 12.92 cm⁻¹        For 1 mg/ml DMSO₂ solution
Line width                 ν_FWHM = 13.47 cm⁻¹         This corresponds to a dephasing time of ~770 fs
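
The conversions quoted in the table comments can be checked with a few lines of arithmetic. This is a rough sketch under stated assumptions: the scan window is taken as T = 1/(c·ν_res) and the dephasing time of a Lorentzian line as 1/(π·c·ν_FWHM); neither relation is spelled out in this section, and the function names are illustrative. With these assumptions the scan window comes out at about 7.1 ps, and the dephasing time at about 790 fs, close to but not identical with the quoted ~770 fs, so the original estimate may rest on a slightly different convention. The last helper applies linear concentration scaling to the absorption coefficient fitted in Extended Data Fig. 6.

```python
import math

C_CM_PER_S = 2.99792458e10  # speed of light in cm/s


def scan_window_ps(resolution_cm1):
    """Scan time window for a given spectral resolution, assuming T = 1/(c * nu_res)."""
    return 1.0 / (C_CM_PER_S * resolution_cm1) * 1e12


def dephasing_time_fs(fwhm_cm1):
    """Dephasing time of a Lorentzian line, assuming T2 = 1/(pi * c * FWHM)."""
    return 1.0 / (math.pi * C_CM_PER_S * fwhm_cm1) * 1e15


def scaled_absorption(alpha_ref_cm1, conc_ref, conc):
    """Absorption coefficient scaled linearly with concentration (same units for both)."""
    return alpha_ref_cm1 * conc / conc_ref


print(f"scan window for 4.7 cm^-1: {scan_window_ps(4.7):.2f} ps")
print(f"dephasing time for 13.47 cm^-1: {dephasing_time_fs(13.47):.0f} fs")
# 1 mg/ml = 1e6 ng/ml; 200 ng/ml is the concentration used in Extended Data Fig. 6.
print(f"alpha at 200 ng/ml: {scaled_absorption(11.96, 1e6, 200):.4f} cm^-1")
```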
The formation and growth of water-ice layers on surfaces and of low-dimensional ice
under confinement are frequent occurrences. This is exemplified by the extensive
reporting of two-dimensional (2D) ice on metals⁵⁻¹¹, insulating surfaces¹²⁻¹⁶, graphite
and graphene and under strong confinement. Although structured water
adlayers and 2D ice have been imaged, capturing the metastable or intermediate edge
structures involved in the 2D ice growth, which could reveal the underlying growth
mechanisms, is extremely challenging, owing to the fragility and short lifetime of
those edge structures. Here we show that noncontact atomic-force microscopy with a
CO-terminated tip (used previously to image interfacial water with minimal
perturbation) enables real-space imaging of the edge structures of 2D bilayer
hexagonal ice grown on a Au(111) surface. We find that armchair-type edges coexist
with the zigzag edges usually observed in 2D hexagonal crystals, and freeze these
samples during growth to identify the intermediate edge structures. Combined with
simulations, these experiments enable us to reconstruct the growth processes that, in
the case of the zigzag edge, involve the addition of water molecules to the existing
edge and a collective bridging mechanism. Armchair edge growth, by contrast,
involves local seeding and edge reconstruction and thus contrasts with conventional
views regarding the growth of bilayer hexagonal ices and 2D hexagonal matter in
general.


Scanning tunnelling microscopy (STM) has been widely used to study
2D ices at surfaces, but resolving edge structures is difficult because
STM is not sensitive to the position of nuclei and its tip can induce
disturbances. Although transmission electron microscopy (TEM) can resolve
atomic lattice edges, high-resolution TEM usually requires high-energy
electrons that can change or even completely decompose the edge
structure of covalently bonded 2D materials and are expected to damage
more weakly bonded ice edges. By contrast, noncontact atomic-force
microscopy (AFM) based on a qPlus sensor can probe interfacial
water with excellent resolution, with use of a CO-terminated tip
ensuring that water molecules are only minimally disturbed thanks to
the ultrahigh flexibility of the tip apex and the weak higher-order
electrostatic force. Here we use this method to image various metastable
edge structures of a 2D bilayer hexagonal ice grown on a Au(111) surface
(Fig. 1a) and resolve the growth mechanisms with atomic detail.

The 2D ice was grown on a Au(111) surface at about 120 K with a
thickness of around 2.5 Å (see Methods, Fig. 1a), corresponding to two water
overlayers (Extended Data Fig. 1a-f). The STM image of the 2D ice
(Fig. 1c) and the corresponding fast Fourier transform (FFT) image
(inset of Fig. 1a) both show a well ordered hexagonal structure, with
periodicity Au(111)-√3 × √3-30° (Wood’s notation; Extended Data
Fig. 1g). Although the honeycomb H-bonding network of the 2D ice is
visible in the STM image, the detailed topology of the edge structures
is difficult to resolve. The AFM frequency-shift (Δf) image of the same
island exhibits much higher resolution (Fig. 1d), such that the atomic
structures of the zigzag and armchair edges can be easily identified.
The total lengths of the zigzag and the armchair edges are comparable,
but the average length of the former is statistically somewhat larger
(two-sided t-test, P = 1 × 10⁻⁷; Fig. 1b). Zigzag edges can grow perfectly
up to lengths of 60 Å, but armchair edges are always interrupted by
step kinks or defects that result in shorter lengths, predominantly
around 10-30 Å (Extended Data Fig. 2).

We then performed systematic AFM imaging at different tip heights
(see Methods and Fig. 2a). At a large tip height, where the AFM signals
are dominated by the higher-order electrostatic force, we can
distinguish two sets of √3 × √3 sub-lattices in the 2D bilayer ice, one of which
Fig. 1 | Experimental setup and STM and AFM images of 2D bilayer ice.

a, Schematic of STM and AFM imaging of a 2D bilayer ice island on Au(111) using
qPlus-based non-contact AFM with a CO-terminated tip. Inset, 2D FFT image
inside the 2D ice island. The line profile across the step edge shows the height of
the island (about 2.5 Å). b, Length distribution diagram of the zigzag and
armchair edges for ten ice islands (n = 249). Inset, statistics on the length of the
zigzag and armchair edges as a fraction of the total length of all counted edges.
c, Constant-current STM image acquired at the set point, 100 mV and 10 pA.

d, Constant-height AFM (Δf) image of the same area as in c, recorded at a tip
height of 10 pm. The zigzag and armchair edges are denoted by green and red
dashed lines, respectively. The tip height is referenced to the STM set point on
the bilayer ice (100 mV, 50 pA), and the oscillation amplitude is 100 pm.
Fig. 2 | Detailed AFM characterization of the 2D bilayer ice and the
corresponding structural model. a, Constant-height AFM (Δf) imaging at tip
heights of 20 pm (left), 0 pm (middle) and -10 pm (right). b, Simulated AFM
images at tip heights of 14 Å (left), 13.7 Å (middle) and 13.5 Å (right). The √3 × √3
unit cell is indicated by the dashed red rhombus. The O-H directionality of the
water molecules is highlighted by the solid red lines. c, Top and side views of the
bilayer ice structure on the Au(111) surface. Au, H and O atoms in the top water
layer are denoted as yellow, white and red spheres, respectively. H and O atoms
in the bottom water layer are shown by blue spheres (with a smaller size for


is highlighted in Fig. 2a (left panel). At a smaller tip height, the bright 
features of this sub-lattice start to show directionality, and the other 
sub-lattice resolves into a V-shaped feature (see the red lines in Fig. 2a, 
middle panel). When the tip height is further decreased to enter into 
the Pauli repulsion-force region, the AFM image shows a honeycomb 
structure with sharp lines connecting the two sub-lattices, resembling 
the H bonds (Fig. 2a, right panel). 

Density functional theory (DFT) calculations reveal that the 2D ice 
grown on the Au(111) surface corresponds to an interlocked bilayer ice
structure (Fig. 2c) consisting of two flat hexagonal water layers (see 
Methods). The hexagons of the two sheets are in registry and the angle 
between water molecules in the plane is 120°. In each water layer, half 
of the water molecules are lying flat (parallel to the substrate), and the
other half are vertical (perpendicular to the substrate), with one O-H 
either upward or downward. The vertical water in one layer donates a 
H bond to the flat water in the other layer, leading to a fully saturated 
H-bonding structure. Although evidence for such a flat bilayer of hex- 
agonal ice has been observed previously on hydrophobic surfaces 
and under hydrophobic confinements, its atomic structure has
not been directly imaged. 

The AFM simulation using a quadrupole (d_z²) tip (Fig. 2b, Methods)
based on the above model agrees well with the experimental results
(Fig. 2a, Extended Data Fig. 3). The very similar height of the flat and
vertical water molecules makes it very difficult to distinguish them in
STM images. However, the flat and vertical water molecules show
distinctly different contrast in AFM images (Fig. 2a, b, left panel) because
the higher-order electrostatic force is very sensitive to the orientation
of the water molecules. We can additionally discern the O-H
directionality of the flat and vertical water via the interplay between the
higher-order electrostatic forces and Pauli repulsion forces (Extended
Data Fig. 3), as highlighted by the red lines in Fig. 2a, b (middle panel).
At small tip heights, where the Pauli repulsion force is dominant, the
sharp bond-like features represent ridges of the potential-energy
landscape experienced by the functionalized probe, mainly arising from
the lateral relaxation of the CO tip induced by the Pauli repulsion force
(Fig. 2a, b, right panel).

Figure 3a, b (step 1) displays magnified AFM images of the zigzag 
and armchair edges, respectively, revealing that the zigzag edge grows 


clarity). The flat and vertical water molecules in the top layer are denoted by the
blue and black arrows, respectively. In the side view, only the water molecules
along one zigzag direction are shown for a clearer view of the top-bottom
water pairs. The tip heights in a are referenced to the STM set point on the
bilayer ice (100 mV, 50 pA). The tip heights in b are defined as the vertical
distance between the apex atom of the metal tip and the outermost atom of the
Au substrate. All the oscillation amplitudes of the experimental and simulated
images are 100 pm and the image sizes are 1.25 nm × 1.25 nm.
Fig. 3 | Proposed growing process for zigzag and armchair edges.

a, b, Constant-height AFM images and the corresponding ball-and-stick models
of the most stable (1) and metastable structures (2-4) of zigzag (a) and
armchair (b) edges. The proposed growing process cycles through steps 1 to 4.
In the AFM images, each red arrow indicates the addition of one bilayer water

under preservation of its original structure, but the armchair-edge 
growth involves edge reconstruction into a periodic structure of 
5756-type member rings—that is, where the edge structure periodi- 
cally repeats the sequence pentagon-heptagon-pentagon-hexagon. 
DFT calculations indicate that the unreconstructed zigzag edge and 
the 5756-type armchair edge are the most stable edges (Extended 
Data Fig. 4). The 5756-type armchair edge forms as a result of com- 
bined effects that minimize the number of unsaturated H bonds and 
reduce the strain energy (Extended Data Fig. 5). It is well known that 
the basal planes of hexagonal ice are usually terminated with zigzag 
edges and that armchair edges are absent because of the higher density 
of unsaturated H bonds. However, in lower-dimensional systems or 
under confinement, the armchair edge can lower its energy by proper 
reconstruction. 

After ice growth was stopped at 120 K, the sample was immediately 
cooled down to 5 K (see Methods) in an attempt to freeze metastable or
intermediate edge structures and ensure relatively long lifetimes to allow 




pair, leading to the structure in the subsequent image. In the ball-and-stick 
models, the red balls and sticks represent the newly added bilayer water pairs, 
and those in blue represent the existing structures. The size of the images is 
3.2 nm × 1.9 nm (a) and 3.7 nm × 2.2 nm (b).


STM and AFM imaging. Owing to the weakly perturbative character of the
CO-functionalized tip, we were able to identify metastable and
intermediate structures and reconstruct the 2D ice-growing process (Fig. 3).
For zigzag edges, we occasionally find individual pentagons attached
to the straight edges and that these can line up to form an array with a
periodicity of 2 × a_ice (where a_ice is the lattice constant of the 2D ice).
We interpret this as indicating that the growth of the zigzag edges is 
initiated by the formation of a periodic array of pentagons (Fig. 3a, 
steps 1-3), which involves the addition of two water pairs for a pentagon 
(see red arrows). The pentagon array is then bridged to form a 56665- 
type structure (Fig. 3a, step 4) and eventually recovers the original zigzag 
edge by adding more water pairs. Interestingly, we can even capture the 
tip-induced growth of an individual pentagon (Extended Data Fig. 6). 
By contrast, the armchair edges do not exhibit this pentagon array 
structure and we instead frequently observe short 5656-type steps at 
the edge (Extended Data Fig. 2). The length of the 5656-type edges is 
considerably shorter than that of the 5756-type edges, presumably 


Fig. 4 | Molecular-dynamics simulation of the
growth of the zigzag and armchair edges.
a simulations during the growth of the zigzag (a) and
armchair (b) edges. The simulation times are
indicated in the bottom right of each snapshot. In all
snapshots (upper panel, top view; lower panel, side
view), the red and blue spheres represent the
top-layer and bottom-layer water molecules of a
pre-existing bilayer ice grain, respectively. The green
spheres represent newly deposited water molecules
and formed structures during the simulated growth
process.

because the 5656-type edge is heavily stressed and is less stable than 
the 5756-type edge (Extended Data Fig. 5). Starting from the 5756-type 
armchair edge, the 575-type member rings are locally converted to 656- 
type member rings by the addition of two water pairs (Fig. 3b, step 2). 
The 656-type member rings then grow laterally to form a 5656-type 
edge (Fig. 3b, step 3) but with limited length, owing to the accumula- 
tion of strain energy. The strain can be partially relaxed by inserting 
one water pair into the hexagon of the 5656-type edge, leading again 
to the formation of a 5756-type edge (Fig. 3b, step 4). Kinetically, such a
growth mechanism prohibits the formation of armchair edges as long 
as the zigzag edges (Fig. 1b). 

To further corroborate this proposed growth mechanism,
molecular-dynamics simulations of water vapour on a Au(111) surface were carried
out (see Methods). We found that 2D bilayer ice islands form on the
surface, in agreement with our experimental observations (Extended
Data Figs. 7, 8). The collective bridging mechanism at the zigzag edge
is perfectly reproduced in Fig. 4a. It is worth noting that the single
pentagon attached to the zigzag edge cannot act as a local nucleation
centre to promote the growth (t = 0.6-0.7 µs in Fig. 4a, Supplementary
Video 1). Instead, a periodic but unconnected array of pentagons is
initially formed at the zigzag edge, and subsequent incoming water
molecules collectively attempt to connect these pentagons, resulting in
a 565-chain structure (t = 2.2 µs in Fig. 4a, Supplementary Video 2). Such
a structure was not observed experimentally, owing to its short lifetime
(Extended Data Fig. 9). The addition of one water pair further bridges
the 565-type structure and the nearby pentagon, leading to the
formation of a 5666-type structure (t = 2.4 µs; see Supplementary Videos 3, 4).
The 5666-type structure grows laterally to form a 56665-type structure
(t = 2.6 µs) and eventually turns into a fully connected hexagon array.

As for the armchair edges, the local seeding growth can be clearly
seen in Fig. 4b, agreeing nicely with the proposed mechanism from
our experiments (Fig. 3b). The conversion from 575- to 656-type
member rings starts from the bottom layer, forming a composite 575/656
structure (t = 0.4 µs in Fig. 4b, Supplementary Videos 5, 6), which is
indistinguishable from the 5756-type edge in the experiments, because
only the top layer of the 2D bilayer ice can be imaged. The resulting
656 step then serves as the nucleation centre to grow the 5656-type
edge (t = 0.6-1 µs, Supplementary Video 7). The addition of one water
molecule into the 5656-type edge results in a highly mobile
unpaired-molecule structure (Supplementary Video 8). Two of those unpaired
water molecules can subsequently coalesce into a more stable
heptagon structure, completing the 5656-to-5756 conversion (t = 1.2 µs,
Supplementary Video 9).

We believe that the observed growth behaviour is a generic phe- 
nomenon for 2D ice, given that the relative stability of the different 
edge structures shows negligible dependence on the water spacing 
and the commensurability with the substrate (Extended Data Fig. 10). 
Indeed, bilayer hexagonal ice forms on different hydrophobic
surfaces and under hydrophobic confinement, and can be viewed
as a stand-alone 2D crystal (2D ice I), the formation of which is
insensitive to the underlying structure of the substrate. Although it would
be exceedingly difficult to extend our imaging method to observe
three-dimensional (3D) ice growth, the growth mechanism that we
have uncovered might also occur at the surface of bilayer hexagonal 
ice, because it lacks dangling H bonds on its surface and might there- 
fore support bilayer-on-bilayer ice growth and ultimately a 2D-to-3D 
ice transformation. 
Methods


STM and AFM experiments 

All the experiments were performed with a combined noncontact
AFM/STM system (Createc) at 5 K using a qPlus sensor equipped with
a tungsten tip (parameters: spring constant, k₀ ≈ 1,800 N m⁻¹; resonance
frequency, f₀ = 26.7 kHz; quality factor, Q ≈ 45,000). Ultrapure H₂O
(deuterium-depleted, <1 ppm; Sigma Aldrich) was used and further
purified under vacuum by 3-5 freeze-and-pump cycles to remove
remaining gas impurities. Then H₂O molecules were dosed in situ onto
a clean Au(111) surface held at 120 K through a dosing tube. The
as-grown sample was first checked by STM at 77 K, and then quickly cooled
down to 5 K for further STM and AFM measurements. Throughout the
experiments, bias voltage refers to the sample voltage with respect
to the tip. The STM topographic images and AFM frequency-shift (Δf)
images were obtained with the CO-terminated tips in constant-current
and constant-height modes, respectively. The CO tip was obtained by
positioning the tip over a CO molecule on the Au(111) surface at a set
point of 100 mV and 30 pA, followed by increasing the bias voltage to
300 mV. The oscillation amplitude of experimental AFM imaging is
100 pm if not specifically mentioned.
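
To give a feeling for the magnitude of the measured signals, the sensor parameters above can be used to convert a frequency shift into an average tip-sample stiffness. The sketch below assumes the standard small-amplitude relation of frequency-modulation AFM, Δf ≈ f₀·k_ts/(2k), with k_ts the tip-sample force gradient; at the 100 pm amplitude used here this is only an approximation, and the function name and the example value of -4 Hz are illustrative, not taken from the measurements.

```python
def tip_sample_stiffness(delta_f_hz, f0_hz=26.7e3, k_n_per_m=1800.0):
    """Average tip-sample stiffness k_ts (N/m) from a frequency shift.

    Uses the small-amplitude approximation delta_f = f0 * k_ts / (2 * k),
    with the qPlus parameters quoted in the text as defaults.
    """
    return 2.0 * k_n_per_m * delta_f_hz / f0_hz


# Hypothetical frequency shift of -4 Hz (attractive regime).
k_ts = tip_sample_stiffness(-4.0)
print(f"k_ts = {k_ts:.3f} N/m")
```

A negative value corresponds to an attractive force gradient; a frequency shift of a few hertz thus maps to a stiffness of a few tenths of a newton per metre for this sensor.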


DFT calculations 

DFT calculations were performed using the Vienna Ab initio
Simulation Package (VASP version 5.3). Projector-augmented wave
pseudopotentials were used with a cutoff energy of 550 eV for the
expansion of the electronic wave functions. Van der Waals
corrections for dispersion forces were considered using the ‘optB86b-vdW’
functional. In the DFT calculations, the system consisted of the
hexagonal 2D bilayer ice on top of a Au(111) substrate modelled by a
four-layer slab. The lattice constant for Au was set to be 4.078 Å and
the bottom three-layer Au substrate was fixed in the DFT calculations.
Monkhorst-Pack k-point meshes of spacing denser than 2π × 0.058 Å⁻¹
were used and the thickness of the vacuum slab was larger than 13 Å.
The geometry optimizations were performed with a force criterion
of 0.01 eV Å⁻¹.
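
The Au lattice constant quoted above fixes the in-plane lengths of the model. As a small worked example, and assuming that the ice is commensurate with the √3 × √3-30° cell reported for the experimental periodicity, the Au(111) nearest-neighbour distance and the resulting 2D ice lattice constant follow from simple geometry. The variable names are illustrative.

```python
import math

A_AU = 4.078  # bulk fcc lattice constant of Au used in the DFT model (Å)

# Nearest-neighbour distance within the Au(111) plane of an fcc crystal.
d_nn = A_AU / math.sqrt(2.0)

# Lattice constant of a sqrt(3) x sqrt(3) R30 overlayer on that surface
# (assumed here to equal the lattice constant of the 2D ice).
a_ice = math.sqrt(3.0) * d_nn

# Periodicity of the pentagon array reported at zigzag edges (2 x a_ice).
pentagon_period = 2.0 * a_ice

print(f"Au(111) nearest-neighbour distance: {d_nn:.3f} Å")
print(f"sqrt(3) x sqrt(3) lattice constant: {a_ice:.3f} Å")
print(f"2 x a_ice: {pentagon_period:.3f} Å")
```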


Simulations of AFM images 

The Δf images were simulated with a molecular-mechanics model based
on methods described previously. We used the following
parameters for the probe-particle-tip model: effective lateral stiffness,
k = 0.75 N m⁻¹; atomic radius, R = 1.661 Å. A quadrupole-like (d_z²) charge
distribution at the tip apex was used to simulate the CO tip with
q = -0.25e. d_z² represents the atomic-orbital function used to simulate
the spatial distribution of charge density at the tip apex, d is the atomic
orbit, z is the orientation of the orbit, e is the elementary charge and q
is the magnitude of the quadrupole charge at the tip apex. The
electrostatic potentials of the ice on Au(111) used in the AFM simulations
were obtained from DFT calculations. The Lennard-Jones parameters
for H and O atoms in the AFM simulation were: r_H = 1.487 Å,
ε_H = 0.680 meV, r_O = 1.661 Å and ε_O = 9.106 meV.
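
The per-element Lennard-Jones parameters above become pair potentials only once they are combined with those of the probe particle. The sketch below shows one common way of doing this, assumed here rather than taken from the text: radii are added, well depths are combined by a geometric mean, and the pair energy is written as ε[(r_min/r)¹² − 2(r_min/r)⁶]. The probe particle is given the quoted radius of 1.661 Å and, as a further assumption, the same well depth as an O atom. All names are illustrative.

```python
import math

# (radius in Å, well depth in meV) for sample atoms, as quoted in the text.
ATOM_PARAMS = {"H": (1.487, 0.680), "O": (1.661, 9.106)}

# Assumption: the probe particle is treated like an O atom.
PROBE = (1.661, 9.106)


def pair_params(element, probe=PROBE):
    """Return (r_min in Å, eps in meV) for a probe-atom pair using simple mixing rules."""
    r_atom, eps_atom = ATOM_PARAMS[element]
    r_probe, eps_probe = probe
    return r_atom + r_probe, math.sqrt(eps_atom * eps_probe)


def lj_pair_energy(r, element):
    """Lennard-Jones energy (meV) in the r_min form: minimum of -eps at r = r_min."""
    r_min, eps = pair_params(element)
    x6 = (r_min / r) ** 6
    return eps * (x6 * x6 - 2.0 * x6)


for element in ("H", "O"):
    r_min, eps = pair_params(element)
    print(f"probe-{element}: r_min = {r_min:.3f} Å, eps = {eps:.3f} meV")
```

With these assumptions the probe-O interaction is markedly deeper than the probe-H interaction, which is one reason the O atoms dominate the short-range repulsive contrast.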


Molecular-dynamics simulations 

We performed large-scale molecular-dynamics simulations and used
the monoatomic model for water-water interactions, which consists
of short-ranged two-body and three-body non-bonding potentials
without explicitly including hydrogen atoms. The 12-6 Lennard-Jones
potential was used for the interaction between water and the Au atoms
of the Au(111) surface. The Lennard-Jones parameters were determined
to be ε_Au-wat = 1.553 kJ mol⁻¹ and σ_Au-wat = 3.2 Å to match the experimentally
measured contact angles for a water droplet on the Au(111) surface.
A cutoff of 10 Å was used for the Lennard-Jones potential. The velocity
Verlet algorithm was used to integrate the equations of motion
with a time step of 2 fs. Periodic boundary conditions were applied
in all three directions of the simulation box. All molecular-dynamics
simulations were carried out using the Large-scale Atomic/Molecular
Massively Parallel Simulator (LAMMPS) package.
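
The water-Au interaction described above is a standard 12-6 Lennard-Jones potential, which can be written down in a few lines. This is an illustrative sketch and not the simulation input: it assumes the 4ε[(σ/r)¹² − (σ/r)⁶] form with plain truncation (no energy shift) at the 10 Å cutoff, and the function name is hypothetical.

```python
import numpy as np

EPS_AU_WAT = 1.553    # kJ/mol, water-Au well depth quoted in the text
SIGMA_AU_WAT = 3.2    # Å, water-Au size parameter quoted in the text
R_CUT = 10.0          # Å, cutoff quoted in the text


def lj_energy(r, eps=EPS_AU_WAT, sigma=SIGMA_AU_WAT, r_cut=R_CUT):
    """12-6 Lennard-Jones energy (kJ/mol) at distance r (Å), zero beyond the cutoff."""
    r = np.asarray(r, dtype=float)
    sr6 = (sigma / r) ** 6
    energy = 4.0 * eps * (sr6 ** 2 - sr6)
    return np.where(r < r_cut, energy, 0.0)


r_min = 2.0 ** (1.0 / 6.0) * SIGMA_AU_WAT   # position of the potential minimum
print(f"minimum at r = {r_min:.3f} Å, energy = {float(lj_energy(r_min)):.3f} kJ/mol")
print(f"energy just inside the cutoff: {float(lj_energy(9.99)):.5f} kJ/mol")
```

The well depth equals ε at r = 2^(1/6)·σ, about 3.6 Å, and the interaction has decayed to a small fraction of ε by the time the cutoff is reached.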

We performed deposition simulations of water vapour on the Au(111)
surface at 120 K. The deposition was initiated on a bare Au sheet with an
area of 155.72 Å × 159.832 Å consisting of three atomic layers. The
simulations were performed in a constant-volume and constant-temperature
(NVT) ensemble. The temperature was controlled by a Langevin
thermostat with a relaxation time of 1 ps. The Au sheet was kept rigid during
the molecular-dynamics simulations, and the water molecules (initially
located 20-25 Å above the Au surface) were given initial velocities with
a random magnitude from 5.0 to 10.0 Å ps⁻¹ in the direction towards the
Au surface. First, we introduced one water molecule to the simulation
cell every 0.3 ns. Next, more detailed molecular-dynamics simulations
were performed to explore the growth behaviour of the bilayer ice at
the zigzag and armchair edges, after the larger-sized bilayer ice grains
were formed. For these more detailed simulations, one water molecule
was introduced to the simulation cell every 100 ns at either the
zigzag edge or the armchair edge. The water molecules were placed at a
random initial location with a distance of 3 Å from the nearest water
molecule at the edge.
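
The deposition protocol above translates into a simple schedule once the 2 fs time step from the previous section is taken into account. The sketch below is illustrative only: it assumes uniformly distributed starting positions, heights and speeds within the quoted ranges and a velocity pointing straight down along -z, none of which is specified in more detail in the text. The function names are hypothetical.

```python
import numpy as np

TIME_STEP_FS = 2.0                 # integration time step quoted in the text
BOX_X, BOX_Y = 155.72, 159.832     # lateral size of the Au sheet (Å)


def steps_between_insertions(interval_ns, dt_fs=TIME_STEP_FS):
    """Number of MD steps between two water insertions."""
    return int(round(interval_ns * 1e6 / dt_fs))


def sample_incoming_molecule(rng):
    """Draw a starting position (Å) and velocity (Å/ps) for one deposited molecule."""
    position = np.array([
        rng.uniform(0.0, BOX_X),
        rng.uniform(0.0, BOX_Y),
        rng.uniform(20.0, 25.0),   # height above the Au surface
    ])
    speed = rng.uniform(5.0, 10.0)
    velocity = np.array([0.0, 0.0, -speed])  # assumption: aimed straight at the surface
    return position, velocity


rng = np.random.default_rng(0)
position, velocity = sample_incoming_molecule(rng)

print("steps between insertions (0.3 ns):", steps_between_insertions(0.3))
print("steps between insertions (100 ns):", steps_between_insertions(100.0))
print("example start position (Å):", position)
print("example start velocity (Å/ps):", velocity)
```

With a 2 fs step, the initial deposition adds one molecule every 150,000 steps, whereas the detailed edge-growth runs leave 50 million steps between insertions, which gives each added molecule far more time to relax at the edge.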


The mechanism of submolecular-resolution AFM imaging 
In previous studies, the electrostatic forces of the individual water clusters give
rise to dark features at the position of H atoms at large tip heights. 
However, for 2D bilayer ice at large tip heights, the long-range attractive 
van der Waals background of the extended water network (Extended 
Data Fig. 3c, green line) smears out the dark contrast of the H atoms 
(Extended Data Fig. 3a, z, in Extended Data Fig. 3c). Instead, we found 
that the O-H directionality imaging of the 2D ice can be achieved 
at smaller tip heights (z, in Extended Data Fig. 3c), where the Pauli 
repulsive forces start to set in, such that the total force signals of the 
water molecules are separated out from the van der Waals background 
(Extended Data Fig. 3b). Such an imaging mechanism relies on the deli- 
cate interplay between the higher-order electrostatic forces and Pauli 
repulsion forces. The contribution from the electrostatic force of Hand 
O atoms (Extended Data Fig. 3g) can spatially modulate the Afcontrast 
of the Pauli repulsions, leading to the apparent O-H directionality. 
To confirm the role of the higher-order electrostatic force in the AFM
images, we performed systematic AFM image simulations for the 2D
bilayer ice using quadrupole ($d_{z^2}$) (Extended Data Fig. 3d) and neutral
(Extended Data Fig. 3e) tip apexes at different tip heights. The O atoms
of the flat water molecules are about 1-2 pm higher than those of the
vertical water molecules. At a large tip height, the vertical water
molecules exhibit brighter contrast than the flat water molecules with the
$d_{z^2}$ tip (Extended Data Fig. 3d, left), and the brighter features
correspond to the flat molecules for the neutral tip (Extended Data Fig. 3e, left).
When the tip height was set to an intermediate value at which the
higher-order electrostatic and Pauli repulsion forces are comparable,
the O–H directionality of the water molecule is evident in the simulated
AFM images with the $d_{z^2}$ tip (Extended Data Fig. 3d, middle). However,
such submolecular features are much less obvious when using the
neutral tip (Extended Data Fig. 3e, middle). Therefore, the inclusion
of the higher-order electrostatic force ($d_{z^2}$ tip) is essential to reproduce
the experimental AFM contrasts (Extended Data Fig. 3b). At a smaller
tip height, where the AFM signals are dominated by the Pauli repulsion,
the simulated AFM images show the same honeycomb structure of the
2D ice for both the neutral and $d_{z^2}$ tips (Extended Data Fig. 3d, e, right).
To further justify the importance of the $d_{z^2}$ tip in reproducing the
experimental results, we compare the experimental and simulated
force curves in Extended Data Fig. 3i–k. We note that the dip in the
experimental force curve ($F_z$) of the flat water molecules is deeper than
that of the vertical water molecules (Extended Data Fig. 3i), which cannot
be explained by the simple picture based on the height difference
of the flat and vertical water molecules. Instead, such a difference can
be attributed to the fact that the negatively charged tip apex gains a
larger (or smaller) attractive (or repulsive) electrostatic force above
the flat water molecules than that above the vertical molecules (see
Extended Data Fig. 3g, j). By contrast, the neutral tip yields a negligible
difference in the $F_z$ curve at the dip position (Extended Data Fig. 3k). In
addition, we found a crossover behaviour at small tip heights where
the Pauli repulsion force is dominant (black ellipse in Extended Data
Fig. 3i), which is also reproduced nicely by the $d_{z^2}$ tip (black ellipse in
Extended Data Fig. 3j) but is absent when the neutral tip is used
(Extended Data Fig. 3k). This crossover behaviour results from the
strong deflection of the CO tip by the Pauli repulsion force (see the red
and blue arrows in Extended Data Fig. 3l). The relaxation of the CO
molecule occurs earlier at the vertical water molecules than at the flat
molecules, primarily arising from the different shapes of the potential
surface (Extended Data Fig. 3h), where the potential distribution above
the vertical water molecules appears to be more anisotropic than above
the flat water molecules.


DFT-calculated formation energies of different edges of the 2D ice 
To compare the relative stability of the zigzag and armchair edges,
edge-formation energies ($E_\mathrm{f}$) were calculated using DFT, which revealed
that the unreconstructed zigzag edge and reconstructed 5756-type
armchair edge are the most stable edges. The edge-formation energy
is defined as

$$E_\mathrm{f} = (n_\mathrm{e} \times E_\mathrm{ad,i} - E_\mathrm{edge})/l$$

where $n_\mathrm{e}$ is the number of the water molecules in edged bilayer ice, $l$
(in nanometres) is the length along the ice edge, and $E_\mathrm{ad,i}$ and $E_\mathrm{edge}$,
defined in Eqs. (1a) and (1b) below, are the adsorption energy (per water
molecule) of the infinite 2D bilayer ice on the Au substrate and the
adsorption energy of the edged 2D bilayer ice on the Au substrate,
respectively.

$$E_\mathrm{ad,i} = (E[\mathrm{Au}] + n_\mathrm{i} \times E[(\mathrm{H_2O})_\mathrm{g}] - E[\mathrm{ice_i/Au}])/n_\mathrm{i} \qquad (1\mathrm{a})$$

$$E_\mathrm{edge} = E[\mathrm{Au}] + n_\mathrm{e} \times E[(\mathrm{H_2O})_\mathrm{g}] - E[\mathrm{ice_e/Au}] \qquad (1\mathrm{b})$$

where $n_\mathrm{i}$ is the number of the water molecules in the infinite ice,
$E[\mathrm{Au}]$ is the energy of the bare Au substrate, $E[(\mathrm{H_2O})_\mathrm{g}]$ is the energy
of the isolated water molecule in the gas phase, and $E[\mathrm{ice_i/Au}]$ and
$E[\mathrm{ice_e/Au}]$ are the total energies of the Au-supported infinite and edged
2D ices, respectively.
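
As a minimal sketch of how the definitions above combine, the edge-formation energy can be evaluated from five total energies, two molecule counts and the edge length. The function names and the numerical values in the example are hypothetical and chosen only to make the arithmetic easy to follow; they are not DFT results from this work.

```python
def adsorption_energy_infinite(e_au, e_water_gas, e_ice_i_on_au, n_i):
    """Eq. (1a): adsorption energy per water molecule of the infinite bilayer ice."""
    return (e_au + n_i * e_water_gas - e_ice_i_on_au) / n_i


def adsorption_energy_edged(e_au, e_water_gas, e_ice_e_on_au, n_e):
    """Eq. (1b): total adsorption energy of the edged bilayer ice."""
    return e_au + n_e * e_water_gas - e_ice_e_on_au


def edge_formation_energy(e_au, e_water_gas, e_ice_i_on_au, n_i,
                          e_ice_e_on_au, n_e, length_nm):
    """Edge-formation energy per unit edge length, (n_e * E_ad,i - E_edge) / l."""
    e_ad_i = adsorption_energy_infinite(e_au, e_water_gas, e_ice_i_on_au, n_i)
    e_edge = adsorption_energy_edged(e_au, e_water_gas, e_ice_e_on_au, n_e)
    return (n_e * e_ad_i - e_edge) / length_nm


# Hypothetical energies (eV) and sizes, for illustration only
e_f = edge_formation_energy(
    e_au=-1000.0, e_water_gas=-14.0,
    e_ice_i_on_au=-1117.6, n_i=8,
    e_ice_e_on_au=-1087.6, n_e=6,
    length_nm=2.0,
)
```

A positive value means that the edged ice is bound less strongly per molecule than the infinite ice, so creating the edge costs energy.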

There are three different orientations for zigzag edges (ZZ1, ZZ2 and 
ZZ3) and armchair edges (AC1, AC2 and AC3), given a specific type of
proton ordering (Extended Data Fig. 4a). ZZ1 and ZZ3 are equivalent, 
as are AC1 and AC3. Each orientation can produce two types of proton 
order along the edge. Experimentally, it is difficult to discern the O-H 
directionality at the edges because the vertical relaxation of the water 
molecules at the edges can easily smear out the weak-force contrasts 
arising from the O-H directionality. However, we could determine 
that the dangling OH is disfavoured at the edge of the top water layer. 

We thus only performed calculations of the non-equivalent orientations
for zigzag edges (Extended Data Fig. 4b, c) and armchair edges
(Extended Data Fig. 4d, e) without or with fewer dangling OHs. In our
calculations, one edge of the bilayer ice (orange O atoms in Extended
Data Fig. 4b–e) was fixed at the same position as in the infinite bilayer ice.
Therefore, the relative formation energies of the other edge, $\Delta E_\mathrm{f}$, can be
calculated after structural relaxation. Extended Data Fig. 4f shows $\Delta E_\mathrm{f}$
with respect to the corresponding unreconstructed 6666-type zigzag
and armchair edges, where the unreconstructed zigzag edge and
5756-type armchair edge are the most stable edges no matter which type of
edge is considered. We note that the 6666-type armchair edge cannot
be seen in the experiment, although the energy of the 6666-type edge
is smaller than that of the 5656-type edge. This is due to the existence
of a stable composite 575/656 structure (t = 0.4 ps in Fig. 4b), which
considerably lowers the 5756-to-5656 conversion barrier (see Extended
Data Fig. 9). Therefore, the growth of armchair edges is governed by
the interplay between the thermodynamics and kinetics, leading to the
5756-to-5656 conversion in the absence of a 6666-type edge.


Insight into the stability of the zigzag and armchair edges 

To gain further insight into the formation energies of different edges,
we decomposed the DFT-calculated formation energy $E_\mathrm{f}$ into three
parts: the energy difference between the edged state and infinite state
of the Au(111) substrate, $E_\mathrm{f,Au}$, the ice, $E_\mathrm{f,ice}$, and the interaction between
the Au(111) substrate and the ice, $E_\mathrm{f,Au\text{-}ice}$. We found that $E_\mathrm{f,Au}$ is
negligible, and thus the only noticeable contributions to $E_\mathrm{f}$ are from $E_\mathrm{f,ice}$ and
$E_\mathrm{f,Au\text{-}ice}$. The detailed relative energies ($\Delta E$) with respect to the energy
of the corresponding unreconstructed 6666-type edge are shown in
Extended Data Fig. 5a, b, where the cyan, blue and red bars represent
$\Delta E_\mathrm{f,Au\text{-}ice}$, $\Delta E_\mathrm{f,ice}$ and $\Delta E_\mathrm{f}$, respectively. In particular, we found that $\Delta E_\mathrm{f,ice}$ is
the dominant component of $\Delta E_\mathrm{f}$, which largely determines the relative
stability of different ice edges.

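The three parts of the formation energy $E_\mathrm{f}$ are defined as

$$E_\mathrm{f,Au} = (E[\mathrm{Au_e}] - E[\mathrm{Au_i}])/L \qquad (2)$$

$$E_\mathrm{f,ice} = (E[\mathrm{ice_e}] - n_\mathrm{e} \times E[\mathrm{ice_i}]/n_\mathrm{i})/L \qquad (3)$$

$$E_\mathrm{f,Au\text{-}ice} = (E[\mathrm{ice_e/Au}] - E[\mathrm{ice_e}] - E[\mathrm{Au_e}] - n_\mathrm{e} \times E_\mathrm{Au\text{-}ice})/L \qquad (4)$$

Here $E_\mathrm{Au\text{-}ice}$ is the binding energy (per water molecule) between the
Au(111) substrate and the infinite 2D ice, defined in Eq. (5),

$$E_\mathrm{Au\text{-}ice} = (E[\mathrm{ice_i/Au}] - E[\mathrm{ice_i}] - E[\mathrm{Au_i}])/n_\mathrm{i} \qquad (5)$$

where $E[\mathrm{Au_e}]$ and $E[\mathrm{ice_e}]$ are the energies of the Au substrate and the ice
separated from the Au-supported edged ice, respectively; $E[\mathrm{Au_i}]$ and
$E[\mathrm{ice_i}]$ are the energies of the Au substrate and the ice separated from
the Au-supported infinite ice, respectively.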
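
The decomposition in Eqs. (2)–(5) can be checked numerically with a short sketch. It assumes that the edged-minus-infinite sign convention of Eqs. (2) and (3) also holds for every term of Eq. (4); the function name and the input numbers are hypothetical. Under that assumption the terms in $E[\mathrm{Au_e}]$ and $E[\mathrm{ice_e}]$ cancel when the three parts are summed, leaving $(E[\mathrm{ice_e/Au}] - (n_\mathrm{e}/n_\mathrm{i})(E[\mathrm{ice_i/Au}] - E[\mathrm{Au_i}]) - E[\mathrm{Au_i}])/L$.

```python
def decompose_edge_energy(e_au_e, e_au_i, e_ice_e, e_ice_i,
                          e_ice_e_on_au, e_ice_i_on_au, n_e, n_i, length):
    """Return (E_f,Au, E_f,ice, E_f,Au-ice) following Eqs. (2)-(5)."""
    # Eq. (5): binding energy per water molecule of the infinite ice
    e_bind = (e_ice_i_on_au - e_ice_i - e_au_i) / n_i
    # Eq. (2): substrate contribution
    ef_au = (e_au_e - e_au_i) / length
    # Eq. (3): contribution of the ice itself
    ef_ice = (e_ice_e - n_e * e_ice_i / n_i) / length
    # Eq. (4): substrate-ice interaction contribution
    ef_au_ice = (e_ice_e_on_au - e_ice_e - e_au_e - n_e * e_bind) / length
    return ef_au, ef_ice, ef_au_ice


# Hypothetical energies (eV), molecule counts and edge length (nm)
parts = decompose_edge_energy(
    e_au_e=-1000.5, e_au_i=-1000.0,
    e_ice_e=-80.0, e_ice_i=-110.0,
    e_ice_e_on_au=-1083.0, e_ice_i_on_au=-1115.0,
    n_e=6, n_i=8, length=2.0,
)
total = sum(parts)
```

Comparing `total` with the closed-form sum above is a quick consistency check when tabulating the three bars of a decomposition plot.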

To explore the reason why the armchair edge is reconstructed to the
5756-type edge, we analysed some details of the H bonds at different
armchair edges in the DFT calculations. $\Delta E_\mathrm{f,ice}$ is mainly related to the H-bonding
interaction between the water molecules at the ice edge. We note on
one hand that the density of unsaturated H bonds at the 5756-type
armchair edge is reduced from that of the unreconstructed 6666-type
($1.15/d_\mathrm{ice}$) to $0.87/d_\mathrm{ice}$, which can greatly lower the formation energy of
the armchair edge. On the other hand, the formation of the 5756-type 
armchair edge introduces only a very small strain on the H bonds, as 
suggested by the small deviation of H-bonding length and angles from 
the unreconstructed 6666-type (Extended Data Fig. 5c). Therefore, 
the 5756-type armchair edge should be energetically favoured over 
the unreconstructed 6666-type. 

Although the 5656-type edge has an even smaller density of unsaturated
H bonds ($0.58/d_\mathrm{ice}$) than does the 5756-type edge, it is much
more stressed (Extended Data Fig. 5c) and becomes less stable than
the 5756-type edge. Indeed, we found by experiment that the length
of the 5656-type edges is primarily below 10 Å, which is considerably
shorter than that of the 5756-type edges (Extended Data Fig. 5d). Such 
a difference can be explained by considering that the 5656-type edge 
cannot grow too long, owing to the accumulation of strain energy. 
Therefore, the stabilization of the 5756-type armchair edge results 
from the combined effects of minimizing the unsaturated H bonds 
and reducing the strain energy. 


Stability of various intermediate structures at the edges 
obtained by molecular-dynamics simulations 

We note that some intermediate structures in molecular-dynamics 
simulations shown in Fig. 4 cannot be observed in experiments (Fig. 3). 
This is related to the relative stability and lifetime of the various
intermediate structures. Owing to computational limitations, it is very
difficult to obtain accurate lifetimes for the intermediate structures,
which are relatively long compared to the simulation time. Instead, we
have calculated the interacting energy ($\Delta E_\mathrm{I}$) between a water molecule
at the edge and the remaining water molecules together with the Au
substrate after optimization for various intermediate structures using a
classical force field. The maximum interacting energy corresponds to
that needed to decompose the existing structure during the growth,
thus providing an estimate of the lifetime.

As shown in Extended Data Fig. 9, our calculations show that the maximum
interacting energy between a water molecule at the zigzag edge and
the remaining water molecules follows ZZ3 > ZZ2 > ZZ4 > ZZ5 > ZZ6 > ZZ1.
Such a trend suggests that individual pentagon structures attached at
the zigzag edge (ZZ1) are the most stable. By contrast, zigzag-565 (ZZ4)
should have the shortest lifetime among the intermediate structures with
paired water, which explains why such a structure cannot be observed
experimentally. In addition, we note that the lifetime of the 5(6···6)5
structure at the zigzag edge increases with the number of the hexagons.

For armchair edge structures, the maximum interacting energy follows
AC4 > AC3 > AC2 > AC1. Interestingly, it was revealed that the composite
575/656 structure (AC1) is very stable. However, we cannot distinguish
between the 575/656 structure and the 5756-type edge in experiment,
because only the top layer of the 2D bilayer ice can be imaged by STM
and AFM. Such a composite 575/656 structure would greatly facilitate
the 5756-to-5656 conversion during the growth of the armchair edge
structure. Furthermore, the lifetime of the 5656-type edge decreases
rapidly as its length increases, which is consistent with experimental
results that indicate that the observed 5656-type edges are mostly short.


Data availability 


The source data are available from the corresponding authors upon 
reasonable request. 
Extended Data Fig. 1 | Experimental evidence for the bilayer nature of 2D ice.
a, d, STM images of a bilayer ice island (a) and cluster (d). Set point, 100 mV and
10 pA. b, e, AFM images of the same ice island (b) and cluster (e). b was acquired
in constant-current mode with set point 100 mV and 50 pA. e was recorded
at a constant height of 280 pm, referenced to the set point of 100 mV and 50 pA
on the Au(111) substrate. c, Height-distribution diagram within the red dashed
rectangular area in a. The red arrow denotes the bottom layer of the bilayer ice,
proving the bilayer nature of the 2D ice. f, Height profile across the red line
shown in d, giving two different steps with heights of about 150 pm and about
250 pm, consistent with the results of the 2D ice island. g, False-colour STM
image of a 2D ice island grown on a Au(111) surface, where the honeycomb
structure of the 2D ice and the herringbone reconstruction of the Au(111)
surface are distinguishable. The atomically resolved STM images of the Au(111)
lattice are superimposed within the face-centred cubic (fcc) and hexagonal
close-packed (hcp) regions, showing good registry between the 2D ice and the
Au substrate. The set points are 100 mV and 10 pA and 5 mV and 6 nA for the
ice island and the Au(111) lattice, respectively. The white dashed grids
correspond to the 1 × 1 lattice of Au(111) within the fcc and hcp regions. The
inset at the upper-right corner is a composite 2D-FFT image of the Au(111) and
2D-ice lattice, and shows the corresponding 1 × 1 and √3 × √3 periodicities.
Extended Data Fig. 2 | Interruption of the armchair edges by defects and
kinks. a–e, Constant-height AFM images of edge areas that contain short
reconstructed armchair edges. The tip height is $z_\mathrm{offset}$ = −10 pm, referenced to
the STM set point 100 mV and 50 pA on the water molecules of the second layer
of bilayer ice. The red and green lines represent the armchair and zigzag edges,
respectively. The red, green and yellow arrows point to three types of kinks at
the armchair edges. Type-1 (red) and type-2 (yellow) kinks correspond to the
cases where the armchair edges are terminated at the hexagons and pentagons,
respectively. The local seeding growth model requires individual nucleation
centres to facilitate the growth of the armchair edges, naturally leading to
these step-like structures. f, Schematic showing the formation of type-3 (green
arrows) kink defects, consisting of 647-type member rings. These defects are
formed owing to the position of the heptagons at the armchair edges, which
leads to two different structure series. The green shaded areas represent
5657-member-ring series, and the unshaded areas represent the 5756-type
member ring series. The joint of the two different series results in a type-3
defect, which could further develop into a trapped 7-type member ring in the
second-outermost layer, as indicated by a red circle in e.
Extended Data Fig. 3 | The mechanism of submolecular-resolution AFM
imaging. a, b, Experimental AFM frequency-shift (Δf) images obtained at tip
heights and oscillation amplitudes of 70 pm and 40 pm (a) and 0 pm and
100 pm (b). c, Δf curves (oscillation amplitude, 40 pm) above a vertical water
molecule (vertical), a flat molecule (flat) and the hollow site of the hexagonal ice
lattice (denoted as background, bkgd) as a function of the tip height. $z_1$ and $z_2$
denote the tip heights of the two Δf images in a and b, respectively. d, e,
Simulated Δf images at different tip heights z (given above each image)
obtained with quadrupole ($d_{z^2}$, q = −0.25e; d) and neutral (q = 0; e) tips. f, Top
view of the 2D bilayer ice structure (top layer) on the Au(111) substrate. The
bottom ice layer is hidden to highlight the structure of the top layer. The green
and red dashed parallelograms in d–f denote the sub-lattices of the vertical and
flat water molecules, respectively. g, Calculated electrostatic potential map of
the bilayer ice on the Au(111) in a plane 7.24 Å above the highest atom in the Au
substrate. h, Simulated total potential map of the bilayer ice on Au(111) in a
plane, corresponding to the position of the CO-tip apex at a tip height of 12.5 Å.
i–k, Vertical force above the flat (F,_-) and vertical (F,_,) water molecule as a
function of tip height. i, Experimental $F_z$ obtained by integrating the
experimental Δf(z) in c according to ref. ”. Before the integration, Δf(z) was
smoothed using a moving average filter with a span of 5. j, k, Simulated
$F_z$ computed with $d_{z^2}$ (j) and neutral (k) tips. l, Simulated lateral deflection of
the quadrupole probe particle in the x direction (X,_4) as a function of the tip
height. X,-4., and X,_4+ correspond to X,_, above the vertical water molecule and
the flat water molecule, respectively. Tip-height references are the same as
those in Fig. 2. In g and h, H and O atoms in the top-layer ice are denoted as white
and red spheres, respectively. The image sizes in a, b and d–h are
1.25 nm × 1.25 nm. See Methods for details.
Extended Data Fig. 4 | DFT-calculated formation energies of different edges
of the 2D ice. a–e, Top view of the top layer of bulk (a), zigzag (ZZ)-edged (b, c),
and armchair (AC)-edged (d, e) 2D ices on a Au(111) substrate. The three
different zigzag and armchair edge types are denoted in a by solid and dashed
polylines, respectively. The fixed edges during the structural relaxation are
marked in orange. The bottom ice layer is hidden, to highlight the structure of
the top layer. Image sizes: 6.52 nm × 2.17 nm (a), 2.00 nm × 2.61 nm (b, c), and
1.73 nm × 2.61 nm (d, e). Lateral size of the supercell used in the DFT
calculations: 2.00 nm × 3.46 nm (b, c) and 1.73 nm × 3.50 nm (d, e). f, The relative
formation energy ($\Delta E_\mathrm{f}$) of the different edge types. See Methods for details.
Extended Data Fig. 5 | Insight into the stability of the zigzag and armchair
edges. a, b, Decomposed DFT-calculated relative formation energies of the 2D
bilayer ice with different edge types (ZZ1 and AC1, a; ZZ2 and AC2, b). The
relative formation energies of different edges are referenced to that of the
corresponding unreconstructed 6666-type edge. The cyan, blue and red bars
represent the relative energy of the interaction between the Au(111) substrate
and the bilayer ice, the isolated bilayer ice, and the Au-supported bilayer ice,
respectively. c, The average O–O distance ($d_\mathrm{OO}$) and H-bonding angle
(O–H···O angle) of the outermost rings of different armchair edges.
d, Experimental length distribution diagram of 5656- and 5756-type armchair
edges for ten ice islands, n = 122. Inset, Statistics on the total length of
corresponding edges. See Methods for details.
Extended Data Fig. 6 | Tip-induced growth of the pentagon structure at the
zigzag edge. a, b, AFM images of the same area during the consecutive
scanning showing the formation of the pentagon structure. Tip height,
$z_\mathrm{offset}$ = 10 pm, referenced to the STM set point of 100 mV and 50 pA on the
water molecule of the bilayer ice. c, d, The corresponding snapshots in the
molecular-dynamics simulations. The dangling-like water molecule
corresponds to the molecule attached to the top layer (see the black arrows
in a and c), and the water molecule located at the middle of the bilayer ice has an
apparently shorter bond (grey arrows in a and c). As highlighted by the red
dashed circles in a and b, during close imaging at a very small tip height, a
complete pentagon structure at the zigzag edge can be formed, induced by the
perturbation of the tip.
Extended Data Fig. 7 | Molecular-dynamics simulation of 2D ice formation
and armchair edge stability. a, Top (upper) and side views (lower) of a
snapshot show 1,394 water molecules deposited on a Au(111) surface at 120 K.
The bottom layer of water molecules is shown in blue and the top layer in red.
Au atoms of the Au surface are shown in black. No good registry between the 2D
ice and the Au substrate is found, probably due to the weak interaction between
them. Although 5656-type armchair edges appear, the 5756-type and
6666-type armchair edges are absent, because of the coincident number of the
water molecules added and the limited length of the edges. b, Transverse
density profile of the 2D bilayer ice. The intensity of the lower peak is slightly
larger than that of the higher one, indicating that the growth of bilayer ice
starts from the bottom layer. c, Snapshot of a bilayer ice ribbon (20.76 nm in
length) on a Au(111) surface after relaxation for 20 ns, originally with two
armchair edges of 5656-type (upper) and 6666-type (lower). Some 5656-type
structures spontaneously convert to 5756-type structures (highlighted by blue
ellipses) during the simulation, indicating that the 5756-type edge should be
thermodynamically more stable than the 5656-type edge. d, Snapshot at t = 1 μs
after 63 water molecules were introduced to 6666-type armchair edges. Most
of the 6666-type structures change to 5756-type or 5656-type structures,
suggesting that the growth of armchair edges is governed by the 5756-to-5656
conversion in the absence of a 6666-type edge.
Extended Data Fig. 8 | Nucleation of the 2D ice on the Au surface. a, Top
(upper) and side views (lower) of consecutive snapshots show 8, 10, 11, 14, 41, 43,
100 and 256 water molecules deposited on a Au(111) surface at 120 K. The 2D
bilayer ice structure was gradually formed through single-layer and
double-layer liquid clusters. b, Top (upper) and side views (lower) of snapshots at times
t = 0 and 23.5 ps after the deposited water molecule (green ball) arrived at the
Au surface. c, Top (upper) and side views (lower) of snapshots at times t = 0 and
787 ps after the deposited water molecule (green ball) arrived on the surface of
bilayer ice. The bottom layer of water molecules is shown in blue and the top
layer in red, and the Au atoms of the Au surface are shown in black. The water
molecule landing on the Au or ice-island surface moves around until it finds its
way to attach to the edge of the ice, without creating any new nucleation
centres.
Extended Data Fig. 9 | Stability of various intermediate structures at the
zigzag and armchair edges obtained by molecular-dynamics simulations.
a, b, Molecular-dynamics simulation snapshots of various intermediate
structures during the growth of the zigzag (a) and armchair (b) edges. One
water molecule was introduced to the simulation cell every 100 ns. The
representative water molecules with low coordination at the edges are marked
by numbers. c, The calculated interacting energy ($\Delta E_\mathrm{I}$) for the different
intermediate structures shown in a and b. $\Delta E_\mathrm{I}$ is defined as the interacting
energy between a specific water molecule and the remaining water molecules
together with the Au atoms in the substrate after optimization. The maximum
energy values are indicated in red. See Methods for details.
Extended Data Fig. 10 | The influence of water spacing on the stability of
different edges. a, b, The DFT-calculated edge-formation energies as a
function of water spacing ($d_\mathrm{OO}$) for free-standing 2D ice with different proton
ordering (AC1 and AC2, see Extended Data Fig. 4a for detailed definitions). The
2D ice with minimum energy has a water spacing of 2.706 Å. The relative
stability of the different armchair edges remains unchanged with water
spacing from 2.706 Å to 2.884 Å; the 5756-type armchair edge is the most stable
edge. The abscissa reflects the cell size in the direction parallel to the edge,
which is crucial, owing to the periodic boundary conditions in the calculation.
$d_\mathrm{OO}$ corresponds to the nearest water–water spacing along the direction parallel
to the edge. $E_\mathrm{f}$ represents the edge-formation energy, similar to that defined in
Methods section ‘DFT-calculated formation energies of different edges of the
2D ice’. All atoms in the ice edge were fully relaxed and the structures of the
different ice edges are similar to those in Extended Data Fig. 4. c, The relative
formation energy ($\Delta E_\mathrm{f}$) of different edges calculated by classical force field,
which follows 5656 > 6666 > 5756 for all cases, regardless of the water spacing
and the commensurability with the substrate.
The size-dependent and shape-dependent characteristics that distinguish nanoscale
materials from bulk solids arise from constraining the dimensionality of an inorganic
structure. As a consequence, many studies have focused on rationally shaping these
materials to influence and enhance their optical, electronic, magnetic and catalytic
properties. Although a select number of stable clusters can typically be synthesized
within the nanoscale regime for a specific composition, isolating clusters of a
predetermined size and shape remains a challenge, especially for those derived from
two-dimensional materials. Here we realize a multidentate coordination environment
in a metal-organic framework to stabilize discrete inorganic clusters within a porous
crystalline support. We show confined growth of atomically defined nickel(II)
bromide, nickel(II) chloride, cobalt(II) chloride and iron(II) chloride sheets through
the peripheral coordination of six chelating bipyridine linkers. Notably, confinement
within the framework defines the structure and composition of these sheets and
facilitates their precise characterization by crystallography. Each metal(II) halide
sheet represents a fragment excised from a single layer of the bulk solid structure, and
structures obtained at different precursor loadings enable observation of successive
stages of sheet assembly. Finally, the isolated sheets exhibit magnetic behaviours
distinct from those of the bulk metal halides, including the isolation of
ferromagnetically coupled large-spin ground states through the elimination of
long-range, interlayer magnetic ordering. Overall, these results demonstrate that the pore
environment of a metal-organic framework can be designed to afford precise control
over the size, structure and spatial arrangement of inorganic clusters.


Several reports have demonstrated the uniform incorporation of
nanoparticles or clusters in metal-organic frameworks through
encapsulation of preformed particles or serendipitous self-assembly during
framework synthesis. Constraining cluster formation within
framework pores has proven to be more difficult, as the absence of sufficiently
stabilizing interactions in most metal-organic frameworks leads to
nonselective agglomeration and unrestricted growth. Nonetheless,
frameworks bearing coordinating groups have, in a few cases, been
shown to encourage site-specific nucleation of clusters or
nanoparticles. Although these methods afford some control over cluster size
and distribution, correlating the properties of the resulting species to
their atomic structure remains challenging.

We proposed that pre-organization of the coordinating groups
in a metal-organic framework could enable the templated growth
of discrete inorganic clusters. Thus, we selected the framework
Zr₆O₄(OH)₄(bpydc)₆ (1) (Fig. 1a; where bpydc²⁻ = 2,2′-bipyridine-5,5′-
dicarboxylate), which features roughly 1.3-nm-wide octahedral cages
lined with chelating sites that readily bind a variety of metal sources as
isolated, mononuclear complexes, including metal(II) halides.
Notably, metallation of the bipyridine linkers of this framework induces
a single-crystal-to-single-crystal transformation that results in
crystallographic ordering of the metal-linker complexes, thereby enabling
their structure determination by crystallography. Once metallated, six
bipyridine linkers point towards the centre of each octahedral cavity,
providing nucleation sites and creating a multidentate scaffold for
cluster growth.

Reaction of 1 with Ni(DME)Br₂ (where DME = 1,2-dimethoxyethane) in
bis(2-methoxyethyl) ether (diglyme) at 120 °C afforded 1(NiBr₂)₉.₉, and
characterization of single crystals by X-ray diffraction at 100 K revealed
the growth of isolated nickel(II) bromide sheets within the octahedral
cages of the framework (Fig. 1b and Supplementary Fig. 1). Coordination
of six bipyridine linkers to edge nickel sites constrains the diameter of
each sheet to about 1.5 nm, with the octahedral cage distorting slightly
to accommodate the sheet dimensions. At full occupancy, each cluster
represents a monolayer of 19 edge-sharing nickel octahedra that closely
resembles a portion of a single layer within the structure of bulk NiBr₂.
Each cluster contains four crystallographically distinct nickel(II) sites.
Two of these sites correspond to twelve nickel centres that define the


Fig. 1 | Solid-state structures. a, b, Portions of the structures of 1 (a) and
1(NiBr₂)₉.₉ (b) at 100 K as determined by single-crystal X-ray diffraction. The four
crystallographically distinct Ni(II) sites in the structure of 1(NiBr₂)₉.₉ are labelled

sheet edges, alternating between nickel centres bound by bipyridine
(site I) and sites facing the tetrahedral cages of the framework (site
II). The third and fourth sites form the sheet interior, comprising six
symmetry-equivalent nickel octahedra (site III) surrounding a central
nickel site (site IV). Nickel site occupancies in the 1(NiBr₂)₉.₉ structure
decrease from 78.4(1)% for site I and 43.2(9)% for site II at the sheet edges
to 39.9(9)% for site III and 23.3(17)% for site IV at the centre (values in
parentheses correspond to standard uncertainties calculated from
the crystallographic refinement). Overall, these occupancies amount
to 52.2(5)% of the expected loading for a Ni₁₉Br₃₈ cluster and suggest
that complete sheets fill 23% of the framework cages and partial sheets
take up 20%, and a combination of mononuclear bipyridine-NiBr₂
complexes and unmetallated linkers probably occupy the remaining
cages. Optimizing the reaction conditions by lowering the
concentration of coordinating solvent (as further discussed below) led to a
higher overall Ni occupancy of 80.5(3)% in the structure of 1(NiBr₂)₁₅.₃.
The average nearest Ni···Ni separation (3.723(18) Å) in this structure
closely matches the separation in bulk NiBr₂ (3.723(10) Å), further
corroborating the similarity of these sheets to those in the bulk structure.
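
The relationship between the refined site occupancies and the overall loading can be reproduced with a short calculation. This is an illustrative sketch, not the crystallographic refinement itself: the multiplicities (six each for sites I, II and III and one for site IV, giving 19 nickel sites per sheet) follow from the description above, while the function and variable names are hypothetical.

```python
# Number of symmetry-equivalent nickel centres per sheet for each site
SITE_MULTIPLICITY = {"I": 6, "II": 6, "III": 6, "IV": 1}  # 19 in total


def nickel_loading(occupancy):
    """Return (Ni atoms per cage, fraction of the full 19-site sheet)."""
    n_ni = sum(SITE_MULTIPLICITY[site] * occupancy[site]
               for site in SITE_MULTIPLICITY)
    return n_ni, n_ni / sum(SITE_MULTIPLICITY.values())


# Refined occupancies quoted for the nickel(II) bromide structure
nibr2_occupancy = {"I": 0.784, "II": 0.432, "III": 0.399, "IV": 0.233}
n_ni, fraction = nickel_loading(nibr2_occupancy)
```

For the quoted occupancies this gives about 9.9 Ni per cage, or roughly 52.2% of a complete 19-site sheet, in line with the overall loading stated in the text.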

Bulk NiCl₂ adopts the same layered structure type as NiBr₂, but
exhibits contracted lattice dimensions as a result of having shorter
nickel-halide bonds. To probe whether 1 could also stabilize nickel(II)
chloride sheets, the framework was treated with a solution of
Ni(DME)Cl₂ in diglyme at 120 °C. The structure of the resulting framework
1(NiCl₂)₁₃.₃ at 100 K (Fig. 2a and Supplementary Fig. 2) confirmed the
formation of analogous nickel(II) chloride sheets. Notably, the
flexibility of the framework allows the bipyridine linkers to conform to the

with Roman numerals at the upper right of each site. Yellow, green, dark red,
red, blue and grey spheres represent Zr, Ni, Br, O, N and C atoms, respectively; H
atoms are omitted for clarity.


more compact nickel(II) chloride lattice. Consistent with its greater
lattice stabilization energy, nickel(II) chloride affords a higher
crystallographic Ni loading (69.9(4)%) compared with nickel(II) bromide under
similar reaction conditions. Moreover, the nickel site occupancies were
found to be 78.4(7)%, 67.1(7)%, 64.5(7)% and 68.4(16)% for sites I, II, III
and IV, respectively, implying that 65% of the octahedral cages contain
full sheets, whereas only 3% contain partial sheets. These results suggest
that nickel(II) chloride preferentially forms complete clusters.
Unlike the nickel(II) bromide structure, 1(NiCI,),; features a slightly
expanded average Ni···Ni separation of 3.578(17) Å compared
with 3.483(6) Å in the bulk structure, which probably reflects a
subtle interplay between the stabilization gained from forming an ideal
NiCl2 lattice and the strain incurred upon contraction of the bipyridine
linkers of the framework around the cluster.
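
The overall loadings quoted for the two nickel halide structures can be reproduced from the individual site occupancies as a multiplicity-weighted mean. The sketch below is only an illustration: the text states that site III comprises six symmetry-equivalent positions around a single central site IV, while the multiplicity of six for each of the edge sites I and II is an assumption made here. The function name `overall_loading` is hypothetical.

```python
# Site multiplicities for one sheet. Site III (six) and site IV (one, central)
# follow the text; six each for edge sites I and II is an assumption.
MULTIPLICITY = {"I": 6, "II": 6, "III": 6, "IV": 1}


def overall_loading(site_occupancy_percent):
    """Return the multiplicity-weighted mean occupancy (in %) over all sites."""
    total_sites = sum(MULTIPLICITY.values())
    weighted = sum(
        MULTIPLICITY[site] * occ for site, occ in site_occupancy_percent.items()
    )
    return weighted / total_sites


# Refined site occupancies (in %) quoted in the text.
nibr2 = {"I": 78.4, "II": 43.2, "III": 39.9, "IV": 23.3}
nicl2 = {"I": 78.4, "II": 67.1, "III": 64.5, "IV": 68.4}

print(f"NiBr2 loading: {overall_loading(nibr2):.1f}%")
print(f"NiCl2 loading: {overall_loading(nicl2):.1f}%")
```

Under these assumed multiplicities the weighted means agree with the reported overall loadings of 52.2(5)% and 69.9(4)%.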

Encouraged by the stabilization of nickel(II) halide clusters in 1,
we pursued the extension of this chemistry to cobalt(II) chloride and
iron(II) chloride; however, attempts under similar reaction conditions
resulted in metallation of only the bipyridine sites. Recognizing that
an equilibrium between the metal(II) halide clusters and solvated
metal species governs sheet assembly, we conducted reactions under
reduced concentrations of coordinating solvent to drive the equilibrium
towards sheet formation. Specifically, performing the reaction
of single crystals of 1 with either CoCl2 or FeCl2 in a 10% (v/v)
mixture of DME in 1,2-difluorobenzene (DFB) at 120 °C facilitates the
growth of cobalt(II) and iron(II) chloride sheets in the framework to
yield 1(CoCL,),, (Fig. 2b and Supplementary Fig. 3) and 1(FeCl,),, (Fig. 2c
and Supplementary Fig. 4), respectively. Close inspection of the cobalt


Fig. 2 | Solid-state structures. a-c, Portions of the structures of 1(NiCI,),; (a),
1(CoCI;),4 (b) and 1(FeCl,),, (c) at 100 K as determined by single-crystal X-ray
diffraction. The five crystallographically distinct metal sites in the structure of


and iron structures revealed a fifth metal site (site V) in addition to the
four distinct octahedral sites present in the nickel(II) halide clusters.
Complexes at site V cap the edges of each sheet at site II and probably
represent a mixture of [MCl4]2− and M(DME)nCl2 (n = 1 or 2) complexes;
however, disorder of these species precluded unambiguous assignment
of their identity. Site occupancies for the four octahedral sites in the
cobalt and iron clusters were found to range from 60.5(12)% to 81.4(6)%
for Co and from 74.6(10)% to 88.5(5)% for Fe (Supplementary Table 1),
and the tetrahedral sites were generally found to be only one-third
occupied. Both structures display relatively high metal loadings (74.7(4)% for




1(CoCI,),, are labelled with Roman numerals at the upper right of each site.
Yellow, green, purple, orange, light green, red, blue and grey spheres represent
Zr, Ni, Co, Fe, Cl, O, N and C atoms, respectively; H atoms are omitted for clarity.


Co and 91.6(3)% for Fe relative to a M,,Cl3, sheet), indicating that minimizing
the amount of coordinating solvent strongly promotes sheet
formation. As with the nickel(II) chloride structure, the two frameworks
contain sheets with slightly longer average M···M separations between
octahedral centres (Co···Co = 3.65(2) Å and Fe···Fe = 3.680(12) Å) compared
with those in the bulk (Co···Co = 3.553 Å and Fe···Fe = 3.603 Å).
Considering that sheet formation close to the crystal exterior may
hinder further diffusion of metal(II) halide (MX2) units into the crystal,
microcrystalline samples were analysed by scanning transmission electron
microscopy energy-dispersive X-ray spectroscopy (STEM-EDS) to


Fig. 3 | Solid-state structures monitoring the growth of nickel(II) halide
sheets. Stages of nickel(II) halide cluster growth based on 100 K single-crystal
structures of 1 after reaction with 1.0, 1.5 and excess equivalents of NiBr2 (top
row) and NiCl2 (bottom row) (Supplementary Tables 1 and 2). In the structures
obtained from the reactions with 1.5 equiv NiX2, sections of the clusters are
faded to illustrate their lower occupancies. Yellow, green, dark red, light green,
red, blue and grey spheres represent Zr, Ni, Br, Cl, O, N and C atoms,
respectively; H atoms are omitted for clarity.
Fig. 4 | Magnetic data. a-d, d.c. magnetic susceptibility data collected for
1(NiBr,),, (a, red symbols), 1(NiCI,),; (b, green symbols), 1(CoCI,),. (c, purple
symbols), 1(FeCl;),. (d, orange symbols), and their corresponding bulk metal(II)
halides (grey symbols) under a 1 T or 0.01 T applied field. emu, electromagnetic
unit. e, Zero-field-cooled (filled orange circles) and field-cooled (empty orange


obtain elemental maps and determine the extent of MX2 penetration.
These experiments reveal that the M and X spatial distributions of each
variant match well with that of Zr (Extended Data Figs. 1-4), suggesting
uniform dispersion of sheets throughout the crystal, rather than their
accumulation at regions near the crystal surface. Importantly, these
results also indicate that an equilibrium exists between the clusters and
dissolved metal species under the reaction conditions, enabling migration
of MX2 species to the crystal interior and reversible sheet formation.

The formation of partially filled sheet fragments in 1(NiBr,),. suggested
that snapshots of sheet growth could be monitored as a function
of metal halide loading, and towards this end single-crystal structures
were determined at 100 K for samples of 1 treated with increasing equivalents
of NiBr2 or Ni(DME)Cl2 relative to bipyridine (Fig. 3). Reaction
with one equivalent of either metal source exclusively resulted in metallation
of the bipyridine linkers, confirming that cluster nucleation
occurs at these sites. For nickel(II) bromide, additional equivalents
populate the rest of the sites, preferring edge sites over those at the
interior (Extended Data Fig. 5a). This trend implies that nickel(II) bromide
sheet formation initiates at the bipyridine sites, followed by a
progressive inward growth towards the centre. In contrast, the remaining
sites in the nickel(II) chloride sheets fill uniformly with increasing
NiCl2 loading (Extended Data Fig. 5b), further indicating the tendency
of nickel(II) chloride to form completely filled sheets.

Because the sheets represent fragments of the corresponding metal
halide monolayers, we anticipated that their magnetic behaviour might
be related to that of the bulk material. In the latter, ferromagnetic coupling
is dominant within monolayers, and antiferromagnetic coupling
occurs between adjacent layers. Accordingly, the product of the
molar magnetic susceptibility and temperature (χMT) for each bulk
material initially increases with decreasing temperature as the spins
within each monolayer align ferromagnetically. Below the Néel temperature
(TN) for each solid, a sharp decrease in χMT is then observed


circles) magnetization per mole of 1(FeCl,),. (M) versus temperature (T) data
taken under a 0.01 T applied field for 1(FeCl,),.. f, Magnetization per mole of
1(FeCl,),. (M) versus applied d.c. magnetic field (H) data for 1(FeCl,),. (empty
orange circles) collected at 2 K using a sweep rate of 9 mT s−1. Solid lines are
guides for the eye. μB, Bohr magneton.


as alternating monolayers adopt opposite spin orientations to form an
antiferromagnetic ground state. Notably, the isolation of individual
layer fragments within 1 presents an opportunity to eliminate the
antiferromagnetic interlayer interactions and simultaneously confine the
magnetic domains to the nanoscale.

To compare the magnetic properties of the framework-confined clusters
to those of the bulk solids, d.c. magnetic susceptibility data were
collected on microcrystalline powder samples of 1(NiCI,),;, 1(NiBr,),,,
1(FeCl,),. and 1(CoCI,),, between 300 and 2 K under applied fields of
0.01, 0.1 and 1 T. For each framework, the per-metal susceptibilities
measured at room temperature are slightly lower than those observed
in the corresponding bulk material, with the exception of 1(NiBr,),,,
which exhibits a large temperature-independent paramagnetic contribution
(Fig. 4a-d). In the case of 1(FeCl,),, and 1(CoClI,),,, it is likely
that the tetrahedral metal sites contribute a lower magnetic moment
than do the octahedral sites, thereby suppressing the per-metal susceptibility.
Analogous to the bulk metal halides, χMT increases for all
cluster-containing materials upon cooling below 300 K, indicative of
ferromagnetic coupling of individual spins within each sheet fragment
to form a total spin S. Notably, χMT continues to increase well below the
Néel temperature for each corresponding bulk material, consistent with
the behaviour expected for isolated monolayers. A steep decrease in
χMT is eventually observed below 10 K for all confined sheets, which we
attribute primarily to Zeeman splitting of the high-spin ground state,
and, in the case of 1(FeCl,),., to magnetic blocking (as discussed below).

To characterize further the static magnetic behaviour of the framework-confined
sheets, we collected zero-field-cooled (ZFC) and field-cooled
(FC) magnetization data for temperatures ranging from 2 to
300 K. Whereas these data are completely superimposable for the
cobalt(II) and nickel(II) materials and are indicative of simple paramagnetism
at low temperatures, a divergence is observed for 1(FeCl,),,
at around 3 K (Fig. 4e), suggesting the immobilization of the total spin
S along a magnetic easy direction and the onset of superparamagnetism.
In support of this observation, variable-field magnetization data
collected for 1(FeCl,),. with a sweep rate of 9 mT s−1 revealed magnetic
hysteresis at 2 K with a coercive field of Hc = 70 mT (Fig. 4f).

To determine the magnitude of the barrier to spin reversal for the
confined sheets in 1(FeCl,),., we collected temperature-dependent
a.c. magnetic susceptibility data under zero d.c. field and at discrete
frequency values ranging from 1 to 1,000 Hz (Extended Data Fig. 6).
A maximum was observed in both the in-phase (χ′) and out-of-phase
(χ″) magnetic susceptibility data, with the peak shifting by only 1.3 K
over the measured frequency range. The frequency dependence in
χ″ precludes the existence of long-range magnetic ordering, and this
low-temperature behaviour can instead be attributed to either superparamagnetism
or a glassy magnetic phase transition. The magnitude
of the frequency shift can be quantified using the Mydosh parameter
(φ), which adopts characteristic values for different magnetic behaviours.
We find a Mydosh parameter of 0.14 for the FeCl2 clusters, which
is most consistent with superparamagnetism. An Arrhenius fitting of
the a.c. susceptibility data affords physically meaningful values for the
spin reversal barrier of Ueff = 16 cm−1 and a relaxation attempt time of
τ0 = 10° s. Notably, these values are competitive with iron(II)
cluster-based single-molecule magnets.
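
As a rough illustration of how such an Arrhenius analysis works, the sketch below fits ln(τ) against 1/T, taking the relaxation time at each a.c. frequency ν as τ = 1/(2πν) and the temperature as the peak position in the out-of-phase susceptibility. It is a minimal sketch, not the authors' procedure: the peak temperatures are synthetic, generated from the reported 16 cm−1 barrier and a placeholder attempt time (`TAU0_ASSUMED`) chosen only for illustration, and the function names are hypothetical.

```python
import math

K_B_CM = 0.6950348  # Boltzmann constant in cm^-1 per K


def linear_fit(xs, ys):
    """Ordinary least-squares line through (xs, ys); returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def arrhenius_fit(freqs_hz, peak_temps_k):
    """Fit ln(tau) = ln(tau0) + U/(kB*T); returns (U in cm^-1, tau0 in s)."""
    inv_t = [1.0 / t for t in peak_temps_k]
    ln_tau = [math.log(1.0 / (2.0 * math.pi * f)) for f in freqs_hz]
    slope, intercept = linear_fit(inv_t, ln_tau)
    return slope * K_B_CM, math.exp(intercept)


# Synthetic peak temperatures (not measured data) for illustration only.
U_ASSUMED = 16.0      # cm^-1, barrier quoted in the text
TAU0_ASSUMED = 1e-7   # s, placeholder attempt time (an assumption)
freqs = [1.0, 10.0, 100.0, 1000.0]
temps = [
    U_ASSUMED / (K_B_CM * math.log(1.0 / (2.0 * math.pi * f) / TAU0_ASSUMED))
    for f in freqs
]

u_fit, tau0_fit = arrhenius_fit(freqs, temps)
print(f"U = {u_fit:.2f} cm^-1, tau0 = {tau0_fit:.2e} s")
```

Because the synthetic points obey the Arrhenius law exactly, the fit simply returns the input parameters; with real data the scatter about the line indicates how well a single thermally activated process describes the relaxation.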

As an additional probe of the magnetic behaviour of 1(FeCl,),o,
Mössbauer spectra were collected on a microcrystalline sample at
temperatures ranging from 5 to 295 K (Extended Data Fig. 7 and Supplementary
Fig. 24). At all temperatures, the spectra exhibit iron(II)
quadrupole doublets, consistent with the crystallographic environments
in 1(FeCl,),, (Supplementary Fig. 25 and Supplementary Tables 4
and 5), and the spectral fits between 10 and 295 K indicate that 1(FeCl,),,
behaves as a paramagnet at these temperatures. Upon cooling from
8 to 5 K, a substantial broadening is observed concomitant with the
gradual appearance of a superparamagnetic sextet with an average
iron(II) hyperfine field of 9.8(2) T. In conjunction with the a.c. magnetic
susceptibility data, these results support the observation that
the framework-confined iron(II) chloride sheet fragments exhibit
superparamagnetism below 8 K.

As research into the electronic and magnetic properties of two-dimensional
materials intensifies, increased attention is being paid
to lateral confinement of monolayers to yield two-dimensional
clusters or quantum dots. Analogous to the confinement of three-dimensional
materials, confinement of two-dimensional materials is
anticipated to reveal distinct or enhanced physical and chemical properties,
including those associated with edge states. Understanding
and exploiting the structural influences on the properties of these
two-dimensional clusters will therefore require chemical syntheses
that yield monodisperse and well defined materials.
Methods


General methods and materials 

All manipulations were performed under a N2 atmosphere in a Vacuum
Atmospheres glovebox or under a N2 or Ar atmosphere using standard
Schlenk techniques. The solvent 1,2-difluorobenzene (DFB) was
deoxygenated by purging with argon for 1 h and dried using a commercial
solvent purification system designed by JC Meyer Solvent
Systems. The solvents 1,2-dimethoxyethane (DME) and bis(2-methoxyethyl)
ether (diglyme) were purchased from Sigma-Aldrich, dried
over Na/benzophenone (DME) or 4 Å molecular sieves (diglyme), and
degassed via three successive freeze-pump-thaw cycles. The compound
2,2′-bipyridine-5,5′-dicarboxylic acid (H2bpydc) was synthesized using
a previously published procedure. The compounds ZrCl4, Ni(DME)Br2,
Ni(DME)Cl2, NiBr2, CoCl2 and FeCl2 were purchased from commercial
vendors (Sigma-Aldrich for ZrCl4, Ni(DME)Br2 and Ni(DME)Cl2; Strem
for NiBr2, CoCl2 and FeCl2) and used as received. All other chemicals
were purchased from commercial vendors and used as received unless
otherwise noted. Inductively coupled plasma optical emission spectrometry
(ICP-OES) analysis was performed on a Perkin Elmer Optima
7000 DV instrument at the University of California, Berkeley, microanalytical
facility. UV-Vis diffuse reflectance spectra were collected
using a CARY 5000 spectrophotometer interfaced with Varian Win
UV software. The samples were contained in a Praying Mantis air-free
diffuse reflectance cell and dispersed in a non-absorbing BaSO4 matrix.
The Kubelka-Munk conversion (F(R) versus wavenumber) of the raw
diffuse reflectance spectrum (R versus wavenumber) was obtained by
applying the formula F(R) = (1 − R)²/2R.
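
For concreteness, the Kubelka-Munk conversion can be written as a one-line function. This is a minimal sketch that assumes the reflectance R is expressed as a fraction between 0 and 1 rather than as a percentage; the function name is hypothetical.

```python
def kubelka_munk(reflectance):
    """Return F(R) = (1 - R)^2 / (2R) for a fractional reflectance 0 < R <= 1."""
    if not 0.0 < reflectance <= 1.0:
        raise ValueError("reflectance must be a fraction in the interval (0, 1]")
    return (1.0 - reflectance) ** 2 / (2.0 * reflectance)


# A perfectly reflecting sample gives F(R) = 0; lower reflectance gives larger F(R).
for r in (1.0, 0.5, 0.1):
    print(f"R = {r:.2f} -> F(R) = {kubelka_munk(r):.3f}")
```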


Synthesis of Zr6O4(OH)4(bpydc)6 (1)
This material was synthesized as a microcrystalline powder using a
previously published procedure. Typically, a 2 l round-bottom flask
equipped with a Schlenk adaptor, glass stoppers and a magnetic
stir bar was charged with H2bpydc (6.11 g, 25.0 mmol), benzoic acid
(224 g, 2.00 mol), and N,N-dimethylformamide (DMF; 1.00 l) from a
newly opened bottle. The resulting mixture was purged with dry Ar
for 30 min. Solid ZrCl4 (5.83 g, 25.0 mmol) was then added, after which
the mixture was purged with dry Ar for an additional 30 min. Deionized
water (820 µl, 45.5 mmol) was added and the mixture was heated with
magnetic stirring for 5 days at 120 °C under a N2 atmosphere. After
allowing the mixture to cool to room temperature, the solvent was
decanted and the resulting white microcrystalline powder was washed
by soaking three times in 1 l aliquots of fresh DMF for 24 h at 120 °C,
followed by solvent exchange with tetrahydrofuran (THF) via Soxhlet
extraction for 3 days. The THF-solvated powder was filtered under dry
Ar, followed by heating at 120 °C under dynamic vacuum for 24 h to give
fully desolvated 1. The powder X-ray diffraction pattern and Langmuir
surface area (2,700 m2 g−1; N2, 77 K) of the material were found to be
consistent with those reported in the literature.
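
The mass-to-mole conversions for the linker, the zirconium source and the water can be checked with a short calculation. The sketch below is illustrative only: it assumes standard atomic masses, the molecular formula C12H8N2O4 for H2bpydc, and a water density of 1.00 g ml−1; none of these values is taken from the source.

```python
# Standard atomic masses in g/mol (assumed values, rounded).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999,
               "Zr": 91.224, "Cl": 35.45}

# Elemental compositions; the H2bpydc formula C12H8N2O4 is an assumption.
COMPOSITION = {
    "H2bpydc": {"C": 12, "H": 8, "N": 2, "O": 4},
    "ZrCl4": {"Zr": 1, "Cl": 4},
    "H2O": {"H": 2, "O": 1},
}


def molar_mass(compound):
    """Molar mass in g/mol from the elemental composition."""
    return sum(ATOMIC_MASS[el] * n for el, n in COMPOSITION[compound].items())


def millimoles(compound, mass_g):
    """Amount of substance in mmol for a given mass in grams."""
    return 1000.0 * mass_g / molar_mass(compound)


WATER_DENSITY_G_PER_ML = 1.00  # assumed
amounts = {
    "H2bpydc": millimoles("H2bpydc", 6.11),
    "ZrCl4": millimoles("ZrCl4", 5.83),
    "H2O": millimoles("H2O", 0.820 * WATER_DENSITY_G_PER_ML),  # 820 µl
}
for name, mmol in amounts.items():
    print(f"{name}: {mmol:.1f} mmol")
```

With these assumptions the calculation returns a 1:1 linker-to-zirconium ratio (25.0 mmol each) and 45.5 mmol of water, matching the quantities stated in the procedure.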

Single crystals of 1 were synthesized following a previously reported
procedure and characterized by single-crystal X-ray diffraction.

Note that refinement of the linker occupancies in the single-crystal
and powder X-ray diffraction structures resulted in occupancies that
range from 76.8% to 100%, consistent with previous reports of missing
linker defects in Zr6O4(OH)4(bpydc)6 and other zirconium metal-organic
frameworks.


General procedure for loading 1 with NiX2 in diglyme

X = Cl, Br. Single crystals of 1 (<0.1 mg) suspended in diglyme were
transferred into a 4 ml PTFE-capped vial. Most of the solvent was
decanted, followed by addition of excess metal source (Ni(DME)Cl2,
Ni(DME)Br2, or NiBr2; 5-10 mg; >50 equiv) and diglyme (3 ml). The mixture
was allowed to react for 1 month at 120 °C, resulting in a colour
change of the crystals to pale yellow. The crystals were then characterized
by single-crystal X-ray diffraction.


Stoichiometric reactions were performed on microcrystalline
powder samples of 1 in the presence of single crystals that were later
characterized by crystallography. Single crystals of 1 (<0.1 mg) suspended
in diglyme were transferred into a 4 ml PTFE-capped vial. Most
of the solvent was decanted, followed by addition of microcrystalline
1 (60 mg), metal source (1.0-3.25 equiv Ni(DME)Cl2, Ni(DME)Br2, or
NiBr2 per bpydc2− in microcrystalline 1) and diglyme. The mixture was
allowed to react for 1 month at 120 °C, resulting in a colour change of
both the crystals and the powder to pale yellow. Most of the solution
was removed by pipette and the crystals were subsequently soaked
three times in 3 ml of fresh DME at room temperature (~32 °C) for 24 h.
In cases where unreacted metal halide solids were observed, these were
removed by carefully transferring a slurry of the framework into a new
vial before each wash. A slurry containing most of the microcrystalline
powder was separated from the crystals and pipetted into a new vial,
after which the solvent was removed under reduced pressure at 80 °C to
give a microcrystalline powder sample of the NiX2-loaded framework.
The remaining single crystals were then used for single-crystal X-ray
diffraction experiments.


General procedure for loading 1 with MX2 in 10% (v/v) DME in DFB

MX2 = FeCl2, CoCl2 and NiBr2. Single crystals of 1 (<0.1 mg) suspended in
diglyme were transferred into a thick-walled borosilicate tube. Most of
the solvent was decanted, followed by addition of excess metal source
(FeCl2, CoCl2 or Ni(DME)Br2; 5-10 mg; >50 equiv), DME (0.30 ml) and
DFB (2.70 ml). The reaction mixture was degassed by three
freeze-pump-thaw cycles, after which the tube was flame-sealed and then
placed in an oven preheated to 120 °C. The mixture was allowed to
react for 1 month at this temperature, resulting in a colour change
of the crystals (purple for FeCl2, blue for CoCl2, and pale yellow for
Ni(DME)Br2). The crystals were then characterized by single-crystal
X-ray diffraction.

Stoichiometric reactions were performed on microcrystalline powder
samples of 1 in the presence of single crystals that were later
characterized by crystallography. Single crystals of 1 (<0.1 mg) suspended
in diglyme were transferred into a thick-walled borosilicate tube. Most
of the solvent was decanted, followed by addition of microcrystalline
1 (60 mg), metal source (3.25 equiv FeCl2, CoCl2, or Ni(DME)Br2), DME
(0.30 ml), and DFB (2.70 ml). The reaction mixture was degassed by
three freeze-pump-thaw cycles, after which the tube was flame-sealed
and then placed in an oven preheated to 120 °C. The mixture was allowed
to react for 1 month at this temperature, resulting in a colour change
of both the crystals and the powder (purple for FeCl2, blue for CoCl2,
and pale yellow for Ni(DME)Br2). Most of the solution was removed by
pipette and the crystals were subsequently soaked three times in 3 ml
of fresh DME at room temperature (~32 °C) for 24 h. In cases where
unreacted metal halide solids were observed, these were removed by
carefully transferring a slurry of the framework into a new vial before
each wash. A slurry containing most of the microcrystalline powder was
separated from the crystals and pipetted into a new vial, after which
the solvent was removed under reduced pressure at 80 °C to give a
microcrystalline powder sample of the MX2-loaded framework. The
remaining single crystals were then used for single-crystal X-ray
diffraction experiments.


Single-crystal X-ray diffraction 

X-ray diffraction analysis was performed on single crystals coated with
Paratone-N oil and mounted on MiTeGen loops. The crystals were
frozen at 100 K by an Oxford Cryosystems Cryostream 700 Plus. Data
were collected at beamline 11.3.1 at the Advanced Light Source at Lawrence
Berkeley National Laboratory using synchrotron radiation
(λ = 0.8856 and 0.9537 Å) on a Bruker D8 diffractometer equipped with
either a Bruker PHOTON100 CMOS detector or a Bruker PHOTON II
CMOS detector. Raw data were integrated and corrected for Lorentz




and polarization effects using Bruker AXS SAINT software. Absorption
corrections were applied using SADABS. Initial evaluation of the diffraction
data suggested that Zr6O4(OH)4(bpydc)6 undergoes a change
of space group from Fm-3m to P2₁3 (no. 225 and 198, respectively) upon
loading with NiBr2, NiCl2, CoCl2, or FeCl2. Based on previous work,
attempts to solve and refine these structures in P2₁3 resulted in unsatisfactory
refinement, thus solution and refinement in the space group
Pa-3 (no. 205) was instead attempted. In the end, the latter space group
gave the most satisfactory refinement. The structure was solved using
direct methods with SHELXS and refined using SHELXL operated
in the OLEX2 interface. No significant crystal decay was observed
during data collection. Thermal parameters were refined anisotropically
for all non-hydrogen atoms. Hydrogen atoms were placed in ideal
positions and refined using a riding model for all structures. Moving
from Fm-3m to Pa-3 results in two twin domains related by the lost mirror
symmetry along the body diagonals of the unit cell. Consequently,
a twin law (TWIN 0 1 0 1 0 0 0 0 -1 2; BASF = 0.50) was required for the
structural refinement.
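
To make the twin law concrete, the first nine numbers of the TWIN instruction can be read as a 3 × 3 matrix, taken row by row, that maps the Miller indices of one twin domain onto those of the other; the final number (2) is the number of twin components. The sketch below is only an illustration of that reading, with a hypothetical helper name.

```python
# Twin matrix read row by row from "TWIN 0 1 0 1 0 0 0 0 -1 2".
TWIN_MATRIX = (
    (0, 1, 0),
    (1, 0, 0),
    (0, 0, -1),
)


def apply_twin_law(hkl, matrix=TWIN_MATRIX):
    """Map the indices (h, k, l) of one twin domain onto the other domain."""
    return tuple(sum(row[i] * hkl[i] for i in range(3)) for row in matrix)


reflection = (0, 2, 1)
twin_related = apply_twin_law(reflection)
print(reflection, "->", twin_related)  # h and k are swapped, l changes sign

# Applying the operation twice restores the original indices, as expected
# for a two-component twin.
print(apply_twin_law(twin_related))
```

A BASF value of 0.50 corresponds to the two domains contributing equally to the measured intensities.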

The metal-organic framework Zr6O4(OH)4(bpydc)6 is derived from
Zr6O4(OH)4(bdc)6 or UiO-66, which is known to have structural
defects where some of the linkers are absent. Therefore, the linker
occupancies in all structures were allowed to refine freely, resulting
in occupancies that range from 76.8% to 100%. When the ligand is not
present, water/hydroxide is known to replace it in the cluster. These,
however, could not be modelled in the structure due to their disorder
and low occupancy. The oxygen atoms of the oxo and hydroxo groups
on the zirconium clusters in the structure were disordered and, in cases
where this disorder could be modelled, the site occupancy factors of
these oxygen atoms were fixed to give a chemical occupancy of 50%.
Hydrogen atoms on the hydroxo groups could neither be found nor
placed and were omitted from the refinement but not from the formula.
Disorder of the linkers and the metal halides in some of the structures
required the use of geometric and displacement parameter restraints.
Voids in the structures that result from disordered solvent that could
not be modelled, large anisotropic displacement parameters that result
from linker and solvent disorder, and, in some cases, low data resolution
gave rise to several A and B level alerts from checkCIF. Responses
addressing these alerts have been included in the crystallographic
information files (CIFs) and can be read in reports generated by checkCIF.
Extensive solvent disorder was found in the pores for most of the
structures and could not be modelled. Consequently, the unassigned
electron density in these structures was accounted for using SQUEEZE
as implemented in the PLATON interface.


Powder X-ray diffraction 

Powder X-ray diffraction patterns were collected on microcrystalline
powder samples of 1(FeCl,),, 1(CoCI,),s, 1(NiCI,),; and 1(NiBr,),;, which
were loaded into 1.0 mm boron-rich glass capillaries inside a N2-filled
glovebox and then flame-sealed. High-resolution synchrotron X-ray
powder diffraction data were subsequently collected at 298 K with a
wavelength of 0.45220 Å at beamline 17-BM-B at the Advanced Photon
Source at Argonne National Laboratory. For all samples, a standard
peak search, followed by indexing through the singular value decomposition
approach, as implemented in TOPAS-Academic, allowed the
determination of approximate unit cell parameters. Analysis of the
patterns of all samples led to the assignment of the space group Pa-3
on the basis of systematic absences. The unit cells and space group
were verified by structureless Pawley refinements. In 1(NiCI,),; and
1(NiBr,),; it was observed that there was broadening of the hkl reflections
arising from the lowering of symmetry to Pa-3 from Fm-3m. Specifically,
it was noted that the reflections corresponding to a mixture of
even and odd hkl values were broadened, such as the (0 2 1) and (2 1 1)
reflections at 2.2° and 2.4°. In the Pawley refinements, this broadening
could be modelled by defining one Lorentzian function for the peaks
corresponding to the Fm-3m space group (all odd or all even hkl values)


and another Lorentzian convolution for the broadened peaks. Doing
so led to an excellent fit, and the parameters for the peak shapes were
implemented in later Rietveld refinements using the data of 1(NiCI,),;
and 1(NiBr,),;, leading to improvements of ~8% and 4% in the weighted
profile R-factor (Rwp), respectively, when compared to refinements
performed without the broadening correction.
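
The distinction between the two peak classes follows from the reflection condition for a face-centred lattice: a reflection is allowed only when h, k and l are all even or all odd. A minimal sketch of that classification is given below; the function name is hypothetical, and the rule shown is the standard F-centring condition rather than anything specific to the refinement software.

```python
def allowed_by_f_centring(h, k, l):
    """True if (h k l) has unmixed parity (all even or all odd indices)."""
    return len({h % 2, k % 2, l % 2}) == 1


# Reflections of mixed parity appear only after the symmetry is lowered,
# and these are the ones described as broadened.
for hkl in [(1, 1, 1), (2, 0, 0), (0, 2, 1), (2, 1, 1)]:
    label = "Fm-3m allowed" if allowed_by_f_centring(*hkl) else "mixed parity"
    print(hkl, label)
```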

Subsequently, Rietveld refinements of all samples were attempted,
using the structural models determined by single-crystal X-ray diffraction
as starting points. The atomic positions were initially not refined.
Occupancies of the metallated species (for example, Fe, Co, Ni, Cl and
Br) were allowed to vary relative to the full occupancy of Zr. These atoms
were also given isotropic atomic displacement parameters that were
individually refined. A separate occupancy factor and an isotropic
atomic displacement parameter were given to all atoms of the bipyridine
ligand, consistent with the practice followed for the single-crystal
structural model refinements. Finally, the cluster oxygen atoms were
given separate atomic displacement parameters and their occupancies
were refined relative to the Zr occupancy.

In the case of samples 1(FeCl,),. and 1(CoCI,),., it was found that when
a Fourier difference map was generated from the single-crystal model
relative to the observed pattern, disordered electron density near the
extraneous Fe or Co species on the periphery of the sheet could be
observed. This disorder is postulated to arise from a mixture of metal
halide species and DME-solvated metal halide species
in the material, and was modelled in the single-crystal
model by reducing the occupancy of the chloride ligands of the peripheral
species in order to accommodate partial oxygen occupancy. An
improved fit in both iron and cobalt cases was obtained when a DME
molecule was modelled as a rigid body and allowed to relax using a
simulated annealing approach, while keeping the rest of the structural
model constant. In both cases, a DME molecule could be found localized
near the peripheral metal species, with one of the oxygen atoms of
the DME molecule in bonding distance to the metal (~2.0 Å) and close
to a chloride position, consistent with the differences in typical bond
lengths between these ligands. Although disorder probably contributes
to the high relative occupancies, the location is consistent with
unresolved electron density observed in the single-crystal models.

In the course of all refinements, the atom positions could not be
refined freely, as they resulted in chemically unreasonable positions
for numerous components of the structural model (particularly the
bipyridine linker). As a result, in the final stages of the refinement, soft
constraints were placed on the atomic positions (with the exception of
those for H, which were not refined). The thermal parameters, sample
and instrument parameters were then fit together with the background
parameters. The resulting calculated diffraction patterns for the final
structural models of 1(FeCl,),o, 1(CoCI,),s, 1(NiCI,),; and 1(NiBr,),; are
in excellent agreement with the experimental diffraction patterns
(Rietveld plots shown in Supplementary Figs. 5-8 and further
crystallographic details given in Supplementary Tables 10 and 11).

Finally, the refined occupancies of the metal(II) halide sheet atoms 
(Fe, Co, Ni, Cl, and Br) are within one standard error of the values 
obtained by single-crystal X-ray diffraction of closely related sam- 
ples, confirming that the structural models used are reasonable and 
applicable to the bulk samples. 


Low-pressure gas adsorption measurements 

Gas adsorption isotherms for pressures in the range 0-1.2 bar were
measured by a volumetric method using a Micromeritics ASAP2420
instrument. A typical sample, consisting of ~100 mg of material, was
transferred to a pre-weighed analysis tube, which was capped with
a Micromeritics TranSeal and evacuated by heating (at 120 °C for 1 or
80 °C for all samples loaded with metal(II) halides) at a ramp rate of
1 °C per min under dynamic vacuum until an outgas rate of less than
3 µbar min−1 was achieved. The evacuated analysis tube containing the
degassed sample was then carefully transferred to an electronic balance


and weighed again to determine the mass of sample. The tube was then
transferred back to the analysis port of the gas adsorption instrument.
The outgas rate was again confirmed to be less than 3 µbar min−1. For all
isotherms, warm and cold free space correction measurements were
performed using ultra-high purity He gas (UHP, 99.999% purity); N2
isotherms at 77 K were measured in liquid N2 baths using UHP-grade gas
sources. Oil-free vacuum pumps and oil-free pressure regulators were
used for all measurements to prevent contamination of the samples
during the evacuation process or of the feed gases during the isotherm
measurements. Langmuir and Brunauer-Emmett-Teller (BET) surface
areas were determined from N2 adsorption data at 77 K.


Magnetic measurements 
Samples were prepared by adding crystalline powder compound
to a 5 mm I.D. (7 mm O.D.) quartz tube containing a raised quartz
platform. Solid eicosane was added to cover the sample to prevent
crystallite torqueing and provide good thermal contact between the
sample and the cryostat. The tubes were fitted with Teflon sealable
adapters, evacuated on a Schlenk line, and flame-sealed under static
vacuum. Following flame sealing, the solid eicosane was melted in a
water bath held at 40 °C. Magnetic susceptibility measurements were
performed using a Quantum Design MPMS2-XL SQUID magnetometer.
d.c. magnetic susceptibility measurements were collected in the
temperature range 2-300 K under applied magnetic fields of 0.01 T,
0.1 T and 1 T. Magnetic hysteresis measurements were performed
at a sweep rate of 9 mT s⁻¹. Diamagnetic corrections were applied to
the data using Pascal’s constants to give χD = -0.00177772 emu/mol
(1(NiCl₂)ₓ), χD = -0.00207772 emu/mol (1(NiBr₂)ₓ), χD = -0.00205272
emu/mol (1(FeCl₂)ₓ), χD = -0.00196972 emu/mol (1(CoCl₂)ₓ), and
χD = -0.00024306 emu/mol (eicosane).

The Mydosh parameter was calculated by extracting the slope of the
T_f vs log(ν) plot, normalized against T_f(0). The freezing temperature,
T_f, is defined as the peak maximum in χ′ at each frequency. The
freezing temperature T_f(0) is calculated by extrapolating the peak in χ′ to
log(ν) = 0 (ref. ”).
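
As a minimal sketch of the calculation described above, the following Python snippet fits a straight line to T_f versus log(ν) and divides the slope by the extrapolated value T_f(0). The use of a base-10 logarithm, the function name and the sample peak positions are assumptions made for illustration; they are not data from this work.

```python
import numpy as np

def mydosh_parameter(frequencies_hz, freezing_temps_k):
    """Slope of T_f vs log10(nu), normalized by T_f extrapolated to log10(nu) = 0."""
    log_nu = np.log10(np.asarray(frequencies_hz, dtype=float))  # assumption: base-10 log
    t_f = np.asarray(freezing_temps_k, dtype=float)
    # Linear fit: T_f = slope * log10(nu) + intercept, where intercept = T_f(0)
    slope, intercept = np.polyfit(log_nu, t_f, 1)
    return slope / intercept

# Hypothetical peak positions in chi' (illustrative values only)
freqs = [1, 10, 100, 1000]        # Hz
t_f = [3.00, 3.03, 3.06, 3.09]    # K
phi = mydosh_parameter(freqs, t_f)
print(f"Mydosh parameter: {phi:.4f}")
```

A linear least-squares fit is used here because the text describes extracting a slope; other fitting choices would be equally compatible with the description.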


Electron microscopy 

Transmission electron microscopy (TEM) was performed on an FEI 
Titan 80-300 kV microscope operating at 300 kV at the National Center 
for Electron Microscopy. Annular dark field scanning TEM images and 
energy dispersive X-ray spectroscopy (EDS) maps were acquired using 
a beam current of 100-300 pA at room temperature. The four EDS silicon
drift detectors had a collection solid angle of ~0.7 sr. Images were
acquired before and after the EDS map to confirm that the sample was
not visibly damaged by the electron beam.

Scanning electron microscopy (SEM) was performed on an FEI 
Quanta Dual Beam FIB 0.5-30 kV microscope operating at 20 kV at 
the Biomolecular Nanotechnology Center at UC Berkeley. Energy- 
dispersive X-ray spectroscopy (EDS) maps were obtained at room tem- 
perature using an Oxford EDS detector attached to the SEM. 


Mössbauer spectral measurements

The Mössbauer spectra of 1(FeCl₂)ₓ were obtained between 5 and 295 K
with a SEE Mössbauer spectrometer equipped with a Co-57 in Rh source.
The isomer shifts are given relative to α-iron at 295 K. The spectral
absorbers were prepared in an N₂ atmosphere glove box by packing
the powder sample into a 2.54 cm diameter polypropylene washer that
was sealed with three layers of packing tape. The samples were then
transferred to the spectrometer, where the absorber was maintained
in a He atmosphere in order to prevent oxidation or decomposition.


General procedure for metal content analysis via ICP-OES 
Roughly 10 mg of activated material was placed in a 20 ml plastic vial,
digested with 10 μl of concentrated HF in 2 ml of dimethylsulfoxide
and diluted with 18 ml of 5% HNO₃ in Millipore water. The resulting
solution was transferred to a 100 ml volumetric flask and diluted to mark
with 5% (v/v) aqueous HNO₃ in Millipore water to give a stock solution
that contained roughly 25 ppm Zr from the sample. The stock sample
solution (10.0 ml) and 2.50 ppm Y (1.00 ml) were added to a 25.0 ml
volumetric flask and diluted to mark with 5% (v/v) aqueous HNO₃ to
give a sample solution that is ~10 ppm Zr with 0.100 ppm Y as an internal
standard. Standard solutions with 0.100, 1.00, 5.00, 10.0 and 15.0 ppm
Zr, Ni, Fe and Co with 0.100 ppm Y as an internal standard were prepared
for the calibration curve.
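
The final dilution step can be checked with the standard relation C1 × V1 = C2 × V2. The short Python sketch below reproduces the concentrations quoted in the procedure; the function name is hypothetical and the 25 ppm stock value is the approximate figure given in the text.

```python
def diluted_concentration(c_initial_ppm, v_aliquot_ml, v_final_ml):
    """Solve C1 * V1 = C2 * V2 for the final concentration C2."""
    return c_initial_ppm * v_aliquot_ml / v_final_ml

# Values taken from the procedure above
zr_ppm = diluted_concentration(25.0, 10.0, 25.0)  # ~25 ppm Zr stock, 10.0 ml into 25.0 ml
y_ppm = diluted_concentration(2.50, 1.00, 25.0)   # 2.50 ppm Y, 1.00 ml into 25.0 ml

print(f"Zr in sample solution: {zr_ppm:.1f} ppm")
print(f"Y internal standard:   {y_ppm:.3f} ppm")
```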
Extended Data Fig. 1 | High-angle annular dark field images and energy-
dispersive spectroscopy data for 1(NiBr₂)ₓ. a, High-angle annular dark field
(HAADF) image (a, top left) and energy-dispersive X-ray spectroscopy (EDS) Zr
(a, top right; yellow), Ni (a, bottom left; green) and Br (a, bottom right; red)
mapping of a microcrystalline powder sample of 1(NiBr₂)ₓ. b, STEM-EDS line
scan analysis for Ni (red) and Zr (yellow) across the length of the crystallite
plotted as normalized atom per cent. The average amount for the two elements
was determined to be 72 ± 12% for Ni and 28 ± 12% for Zr, corresponding to a
Ni:Zr ratio of 2.6. c, EDS spectrum for the crystallite of 1(NiBr₂)ₓ. Signals for Cu
and Au originate from the space-filling washer and sample grid, respectively.
Extended Data Fig. 2 | High-angle annular dark field images and energy-
dispersive spectroscopy data for 1(NiCl₂)ₓ. a, High-angle annular dark field
(HAADF) image (a, top left) and energy-dispersive X-ray spectroscopy (EDS) Zr
(a, top right; yellow), Ni (a, bottom left; green) and Cl (a, bottom right; red)
mapping of a microcrystalline powder sample of 1(NiCl₂)ₓ. b, STEM-EDS line
scan analysis for Ni (green) and Zr (yellow) across the length of the crystallite
plotted as normalized atom per cent. The average amount for the two elements
was determined to be 70.7 ± 11% for Ni and 29 ± 11% for Zr, corresponding to a
Ni:Zr ratio of 2.4. c, EDS spectrum for the crystallite of 1(NiCl₂)ₓ. Signals for Cu
and Au originate from the space-filling washer and sample grid, respectively.
cps, counts per second.
Extended Data Fig. 3 | High-angle annular dark field images and energy-
dispersive spectroscopy data for 1(CoCl₂)ₓ. a, High-angle annular dark field
(HAADF) image (a, top left) and energy-dispersive X-ray spectroscopy (EDS) Zr
(a, top right; yellow), Co (a, bottom left; violet) and Cl (a, bottom right; green)
mapping of a microcrystalline powder sample of 1(CoCl₂)ₓ. b, STEM-EDS line
scan analysis for Co (violet) and Zr (yellow) across the length of the crystallite
plotted as normalized atom per cent. The average amount for the two elements
was determined to be 75 ± 13% for Co and 25 ± 13% for Zr, corresponding to a
Co:Zr ratio of 3.0. c, EDS spectrum for the crystallite of 1(CoCl₂)ₓ. Signals for Cu
and Au originate from the space-filling washer and sample grid, respectively.
Extended Data Fig. 4 | High-angle annular dark field images and energy-
dispersive spectroscopy data for 1(FeCl₂)ₓ. a, High-angle annular dark field
(HAADF) image (a, top left) and energy-dispersive X-ray spectroscopy (EDS)
mapping of Zr (a, top right; yellow), Fe (a, bottom left; orange) and Cl (a, bottom
right; green) of a microcrystalline powder sample of 1(FeCl₂)ₓ. b, STEM-EDS line
scan analysis for Fe (orange) and Zr (yellow) across the length of the crystallite
plotted as normalized atom per cent. The average amount for the two elements
was determined to be 77 ± 9% for Fe and 23 ± 9% for Zr, corresponding to an Fe:Zr
ratio of 3.3. c, EDS spectrum for the crystallite of 1(FeCl₂)ₓ. Signals for Cu and
Au originate from the space-filling washer and sample grid, respectively.
Extended Data Fig. 6 | a.c. magnetic susceptibility data. a, b, In-phase (a) and
out-of-phase (b) variable-temperature a.c. magnetic susceptibility data from 2
to 10 K for 1(FeCl₂)ₓ under zero d.c. magnetic field and a 0.4 mT a.c. magnetic
field oscillating at frequencies of 1, 5, 7.5, 10, 50, 75, 100, 500, 750 and 1,000 Hz
(blue to red). Coloured lines are guides for the eye.
Extended Data Fig. 7 | Mössbauer spectra. a, b, ⁵⁷Fe Mössbauer spectra for
1(FeCl₂)ₓ at 100 K (a) and 5 K (b). The data were fit with four high-spin
octahedral Fe(II) components (green), two high-spin four- and five-coordinate
Fe(II) components (red), and a magnetic hyperfine component (grey).
Spontaneous oxidation leads to a high-spin Fe(III) impurity (blue) visible at 5 K.
Overall fits are depicted in black.
More than one-third of Earth’s landmass is drained by rivers that seasonally freeze
over. Ice transforms the hydrologic, ecologic, climatic and socio-economic
functions of river corridors. Although river ice extent has been shown to be declining
in many regions of the world, the seasonality, historical change and predicted future
changes in river ice extent and duration have not yet been quantified globally. 
Previous studies of river ice, which suggested that declines in extent and duration 
could be attributed to warming temperatures, were based on data from sparse
locations. Furthermore, existing projections of future ice extent are based solely on
the location of the 0-°C isotherm. Here, using satellite observations, we show that
the global extent of river ice is declining, and we project a mean decrease in seasonal 
ice duration of 6.10 + 0.08 days per 1-°C increase in global mean surface air 
temperature. We tracked the extent of river ice using over 400,000 clear-sky Landsat 
images spanning 1984-2018 and observed a mean decline of 2.5 percentage points 
globally in the past three decades. To project future changes in river ice extent, we 
developed an observationally calibrated and validated model, based on temperature 
and season, which reduced the mean bias by 87 per cent compared with the 0-degree- 
Celsius isotherm approach. We applied this model to future climate projections for 
2080-2100: compared with 2009-2029, the average river ice duration declines by 
16.7 days under Representative Concentration Pathway (RCP) 8.5, whereas under RCP 
4.5 it declines on average by 7.3 days. Our results show that, globally, river ice is 
measurably declining and will continue to decline linearly with projected increases in 
surface air temperature towards the end of this century. 


River ice, which is widespread at middle to high latitudes and elevations,
regulates many aspects of river functions. For example, river
ice contributes to the seasonal ice road network, which serves remote
Arctic communities. During the spring melt, ice-jam floods cost about
US$300 million in 2017 in North America alone. Although disruptive to
humans, ice-jam flooding has an ecologically beneficial role, distributing
fresh water, sediments and nutrients to riparian ecosystems. River
ice is also thought to regulate greenhouse gas emissions from rivers
to the atmosphere by seasonally blocking an estimated 87,000 km²
of stream surface.

Despite the wide-ranging importance of river ice, knowledge of
its global extent and change is extremely limited. Three studies have
investigated historical river ice extent in the Northern Hemisphere:
the first estimated changes in river ice phenology from 1979 to 2009
with a physically based model; the second estimated that 56% of rivers
were affected by ice cover, using the 0-°C surface air temperature (SAT)
isotherm as a proxy for river ice; and the third found consistent
trends of later surface water freeze-up (5.7 days later per 100 years) and
earlier break-up (6.3 days earlier per 100 years) based on long-term
records of ice occurrence from 5 rivers and 21 lakes. Various rates of
change have been observed from local to regional records, but
extrapolating these observations globally is challenging because of
poor spatial coverage and, more importantly, the spatially heterogeneous
nature of ice dynamics revealed by evaluations of ice break-up
dates along river profiles. Moreover, trends from in situ observations
are inconsistent owing to differences in the definitions of phenological
dates, changes in instrumentation and the selection of study sites
and analysis periods. Of the few studies that have predicted future
changes in river ice extent, most have been based on simple ice-SAT
relationships derived from in situ records and conducted at regional
scales. To accurately project future changes in river ice extent at
the global scale, a robust and comprehensive understanding of the
relationship between climate and ice extent is required.

In this study, we present a global, multitemporal river ice extent 
dataset, based on 407,880 satellite images from 34 years of observa- 
tions from the Landsat 5-8 missions (1984-2018). Analysis reveals 
patterns of change in global river ice cover and enables the develop- 
ment and validation of a simple, yet highly predictive, empirical model 
of river ice extent. Applying the model to future climate projections, 
we forecast end-of-century changes in the global extent and seasonal 
duration of river ice cover. 

To construct a global multitemporal river ice extent dataset, we first
identified 7.5 million river centreline locations observable by Landsat
with a width ≥90 m and a water occurrence ≥90% (refs. *”), largely
corresponding to rivers with stream order ≥3 (ref. ™). To calculate river ice
extent, we then extracted snow/ice conditions from the quality band of
Landsat images on the Google Earth Engine platform. Snow/ice in the
quality band was classified by the US Geological Survey using the Fmask
algorithm, which labels each pixel as clear, water, cloud, cloud shadow
or snow/ice. To reduce the volume of data, we aggregated pixel-level
snow/ice conditions into the percentage of total river length covered
by ice, or river ice extent, for each Landsat image. To our knowledge,
the result constitutes the first global multitemporal quantification of
river ice extent.

Fig. 1 | Extent of river ice from 1984 to 2018. a, Map of mean river ice extent (in
terms of ice-covered length percentage) for the winter season: boreal winter
(December, January and February) for the Northern Hemisphere and austral
winter (June, July and August) for the Southern Hemisphere. The bar plot shows
the monthly percentage of ice-covered rivers globally. The percentage of
studied rivers observed successfully by Landsat is shown in parentheses.

The main source of uncertainty in the river ice extent dataset comes 
from the classification error of snow/ice in Fmask. Although the spec- 
tral method for classifying snow/ice was adapted from other optical 
sensors that have been validated, the snow/ice classification in Fmask
has not previously been systematically evaluated for Landsat images.
By comparing Fmask-derived river ice extent to in situ river ice records
in Alaska (from the US National Weather Service) and Canada (from
the Water Survey of Canada), we estimated the overall accuracy of
the Fmask-derived river ice extent to be 0.94 (P < 0.001; see details
in Methods). 

Using the global river ice extent dataset, we calculated large-scale 
river ice coverage and estimated its recent changes. Globally, we esti- 
mated a maximum ice extent of 56% for the 94% of rivers that were 
successfully observed in March (Fig. 1a). The distribution of river ice 
was strongly asymmetric between hemispheres. In the Northern Hemi- 
sphere, where other studies have estimated the maximum extent of 
river ice, we found that 66% of the observed river length in March was 
ice-covered, about 18% higher than previous estimates (note that 4%
of the targeted rivers were not successfully observed in March owing to
insufficient data). In the Southern Hemisphere, river ice was detected
only in New Zealand, the southern tip of the Andes in South America
and the southernmost part of Australia. River ice was found at the
lowest latitudes in continental regions with high topographies, such
as the Rocky Mountains in North America and the Tibetan Plateau in
Asia. Conversely, less ice was detected over relatively high latitudes in
Western Europe and the Pacific Northwest of the United States, probably
because of the influence of nearby ice-free oceans.

70 | Nature | Vol 577 | 2 January 2020

Fig. 1 (continued) | b, Map of changing river ice conditions between 1984-1994
and 2008-2018. Changes were calculated at a 5° x 5° tile scale instead of at the
Landsat tile scale used in a to increase data availability. The bar plot shows the
monthly river ice change with the percentage of studied rivers successfully
observed by Landsat in parentheses. In both maps, the black area denotes either
insufficient data or a lack of Landsat-observable rivers. Axis labels: river ice
extent (%); river ice extent change (percentage points).

Comparing observed river ice cover between 2008-2018 and 
1984-1994, we detected a monthly global decline ranging from 0.3 to 
4.3 percentage points (Fig. 1b; note that the percentage point change 
and the percentage change are different—that is, moving from 10% 
to 7.5% would be a 2.5 percentage point change, but a 25% change). 
The magnitude of decline was lower during July-September, when 
river ice is least prevalent. The majority of the changes in the northern
mid- to high latitudes are towards less river ice cover, with the greatest
declines around the Tibetan Plateau, eastern Europe and Alaska. The 
monthly river ice change was calculated wherever data were available 
for both decades (see Extended Data Figs. 1, 2), accounting for 47-75% 
of the global rivers successfully observed by Landsat, depending on 
the month. 
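
The distinction between a percentage point change and a percentage change noted above can be made concrete with a short Python sketch. The function names are hypothetical; the input values are the 10% to 7.5% example given in the text.

```python
def percentage_point_change(old_pct, new_pct):
    """Absolute difference between two percentages, in percentage points."""
    return new_pct - old_pct

def percent_change(old_pct, new_pct):
    """Relative change with respect to the starting percentage, in percent."""
    return (new_pct - old_pct) / old_pct * 100.0

# Example from the text: river ice extent moving from 10% to 7.5%
pp = percentage_point_change(10.0, 7.5)
rel = percent_change(10.0, 7.5)
print(f"{abs(pp)} percentage point decline, {abs(rel)}% decline")
```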

The observed decline in river ice is likely to continue with predicted
global warming. By matching the river ice extent dataset with a 30-day
prior mean SAT from the ERA5 climate reanalysis dataset, we demonstrate
that river ice extent can be accurately represented, based on
temperature and season, by a logistic regression model (Fig. 2a; root-
mean-square error (RMSE), 13.8 percentage points; mean bias (MBS),
0.6 percentage points). Within the critical temperature range (-10 °C to
10 °C) for ice-water transition, our model reduced the RMSE by 30% and
MBS by 87% compared with the 0-°C-isotherm model (Fig. 2b). Using
this model, we also found that, as suggested by a previous regional
correlation analysis, SAT is a stronger control during break-up than
during freeze-up.

Fig. 2 | Modelling river ice extent. a, Logistic regression model (lines)
constructed from the relationship between river ice extent and 30-day prior
mean SAT, with the period encompassing break-up (August-January) and
freeze-up (February-July) treated separately. b, Comparison of river ice models
by mean bias and RMSE.
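
To illustrate how a logistic model and a 0 °C isotherm baseline can be compared by RMSE and mean bias, the sketch below evaluates both on a few synthetic observations. The midpoint and steepness of the logistic curve, the synthetic data and all function names are assumptions for illustration only; they are not the fitted model or the data of this study, and the sketch uses a single curve rather than separate break-up and freeze-up fits.

```python
import numpy as np

def logistic_ice_extent(sat_c, midpoint_c=0.0, steepness=0.5):
    """River ice extent (%) as a decreasing logistic function of 30-day prior mean SAT (deg C).

    midpoint_c and steepness are hypothetical parameters, not fitted values.
    """
    sat = np.asarray(sat_c, dtype=float)
    return 100.0 / (1.0 + np.exp(steepness * (sat - midpoint_c)))

def isotherm_ice_extent(sat_c):
    """0 deg C isotherm baseline: fully ice-covered below 0 deg C, ice-free otherwise."""
    sat = np.asarray(sat_c, dtype=float)
    return np.where(sat < 0.0, 100.0, 0.0)

def rmse(predicted, observed):
    """Root-mean-square error, in the units of the inputs."""
    diff = np.asarray(predicted, dtype=float) - np.asarray(observed, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

def mean_bias(predicted, observed):
    """Mean of (predicted - observed), in the units of the inputs."""
    diff = np.asarray(predicted, dtype=float) - np.asarray(observed, dtype=float)
    return float(np.mean(diff))

# Synthetic observations (percent ice cover), for illustration only
sat = np.array([-10.0, -5.0, 0.0, 5.0, 10.0])
observed = np.array([95.0, 85.0, 50.0, 15.0, 5.0])

for name, model in (("logistic", logistic_ice_extent), ("isotherm", isotherm_ice_extent)):
    predicted = model(sat)
    print(name, round(rmse(predicted, observed), 1), round(mean_bias(predicted, observed), 1))
```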

Applying this river ice model to future SAT data for the end of the
century (2080-2100) from the CESM climate model (under RCP 8.5),
we found that, compared with a 2009-2029 reference period (chosen
to centre around the present year), monthly declines in global river
ice extent ranged from 9 to 66% (0.4-9.3 percentage points) (Fig. 3
shows the changes for the Northern Hemisphere). We found a similar
pattern for RCP 4.5, with a smaller magnitude of change (globally
0.2-3.2 percentage points, corresponding to a 4-35% decline; Extended
Data Figs. 3, 4).

We divided the global landmass into zones defined by ice duration:
ice-free (duration <5 days), intermittent (5 days ≤ duration <15 days),
0.5-3 months, 3-6 months and >6 months (Fig. 4a). We found substantial
areal decline for regions with ice duration of >6 months and a general
shift of zones with shorter ice duration to higher latitudes. Across
the studied rivers, between 2009-2029 and 2080-2100, we estimated
an average rate of decline in river ice duration of 23.5 days per century
globally under RCP 8.5, with ice duration declining most severely in the
Rocky Mountains, the northeastern United States, eastern Europe and
the Tibetan Plateau (Fig. 4b). As expected, the decline in river ice duration
under RCP 4.5 is less severe: the average decline in duration globally
is 10.3 days per century, a rate slightly greater than that estimated
for the twentieth-century Northern Hemisphere (see Extended Data
Fig. 5). Application of the river ice model to SAT from two other model
simulations from the Coupled Model Intercomparison Project Phase 5
(CMIP5) shows a similar magnitude of change (see Methods). We also
estimated the sensitivity of global river ice change to the increase in
global mean SAT and found that for each 1 °C increase in global mean
SAT, mean ice duration is projected to decrease by 6.10 ± 0.08 days
(Fig. 4c), and the percentage of rivers affected by ice is projected to
decrease by 1.48 ± 0.03 percentage points (Extended Data Fig. 6).

Fig. 3 | Modelled average monthly river ice difference between 2009-2029
and 2080-2100 using CESM SAT output (RCP 8.5). The percentage point
change over Northern Hemisphere is shown in red, with the corresponding
percentage change in parentheses. White land areas denote a lack of Landsat-
observable rivers.

Fig. 4 | Future changes in river ice duration. a, Modelled northward shifts of
ice duration zones between 2009-2029 and 2080-2100 using CESM SAT
output (RCP 8.5) (see Extended Data Fig. 5 for an estimation for RCP 4.5 and for
the Southern Hemisphere). White land areas denote a lack of Landsat-
observable rivers. b, Map of changes in river ice duration between 2009-2029
and 2080-2100. c, The relationship between global mean river ice duration and
the changes in global mean SAT.

There are three primary implications of this study. First, our results
reveal that more than half of Earth’s rivers are covered by ice during the
winter months, signifying a wider influence of river ice than previous
estimates. As river ice is thought to impede the emission of greenhouse
gases normally released by rivers, this upward revision implies
a stronger seasonal signature in greenhouse gas emissions from the
global river network. Second, projected future declines in river ice
extent will transform the functions of Earth’s ice-affected rivers. For
example, shortening ice durations will force the transition from land-
based winter transportation to waterways in the high latitudes, where
a recent study suggests a 14% reduction in the land area accessible by
winter roads by mid-century. The loss of river ice will also substantially
alter ways of living for residents of ice-affected regions in terms of the
cultural ecosystem services that ice provides. Finally, our results demonstrate
that, globally, the mean duration and maximum extent of river
ice vary approximately linearly with mean SAT for the studied range of
warming. Knowing these linear rates of change enables us to quickly
and accurately estimate the changes in river ice extent and duration
caused by future climate change, allowing more accurate propagation
of its influence on the socio-economic, hydrologic, biogeochemical
and ecological processes of the global river system.
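
Because the reported sensitivities are linear in global mean SAT, a first-order estimate of future change is a simple multiplication. The sketch below applies the two central rates quoted above (6.10 days and 1.48 percentage points per 1 °C); the function name and the 2 °C warming level are hypothetical, and the stated uncertainties are ignored.

```python
# Central sensitivities quoted in the text (per 1 deg C increase in global mean SAT)
DURATION_DAYS_PER_DEGC = -6.10   # change in mean ice duration, days
AFFECTED_PP_PER_DEGC = -1.48     # change in rivers affected by ice, percentage points

def projected_change(delta_sat_degc):
    """Linear estimate of river ice change for a given global mean SAT increase."""
    return {
        "ice_duration_days": DURATION_DAYS_PER_DEGC * delta_sat_degc,
        "rivers_affected_pp": AFFECTED_PP_PER_DEGC * delta_sat_degc,
    }

# Hypothetical warming level, for illustration only
change = projected_change(2.0)
print(change)
```

Such an estimate is only meaningful within the range of warming studied, since the linear relationship is reported for that range.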
Methods


Data 
Multiple datasets have been used in this study, each of which is
described in detail below:

1. Global River Widths from Landsat (GRWL)
2. JRC surface water occurrence
3. Quality band Fmask classifications of the Landsat collection 1
tier 1 images
4. ECMWF ERA5 surface temperature
5. NEX-GDDP climate projection SAT data

GRWL, or Global River Widths from Landsat, is a global river database
that contains 58 million river centreline locations and widths. We
used the GRWL Vector Product V01.01 (dataset link:
https://zenodo.org/record/1297434#.W8JkshNKh24). Specifically, we used
the following properties:

• Geometry (location): expressed as point geometry with latitude
and longitude in EPSG:4326 projection.

• width_m: used to identify rivers with a width of more than 90 m.

• lake_flag: indicates whether a centreline belongs to a river or to
a lake or reservoir.

• nchannel: number of channels. GRWL tends to trace the overall river
centre in multichannel or braided rivers, which sometimes overlaps
with islands. We only used single-channel rivers (by setting nchannel = 1)
in our study to avoid extracting ice status over the non-water areas.

The global surface water occurrence map is a raster layer at 30 m
spatial resolution with pixel values ranging from 0 to 100, indicating
the percentage of times when water was detected at its location in the
Landsat record. The map layer was constructed by classifying water
and non-water for each of the global monthly mosaic images from
Landsat 5, 7 and 8 between March 1984 and October 2015 (inclusive).

Fmask is a classification algorithm designed for Landsat images to
classify each pixel into five different categories (clear, water, snow/ice, 
cloud, cloud shadow). It is competent at classifying cloud and cloud 
shadow, and its classification results have been incorporated into the 
quality band for all Landsat collection 1 tier 1 images. 

ERA5 is a reanalysis product that incorporates historical records of
land surface and atmospheric variables into the latest modelling framework
to produce a global, gridded dataset at 31 km spatial resolution.
So far, ERA5 is available from 1979 at hourly or monthly temporal steps.
We accessed the dataset from the European Centre for Medium-Range
Weather Forecasts (ECMWF) website. We first downloaded the hourly
global SAT variable (t2m) from 1 March 1984 to 31 December 2018 at
6 h intervals (0:00, 6:00, 12:00, 18:00). We then calculated the daily
mean SAT by averaging these four hourly values. Finally, we calculated
the time series of mean 30-day prior temperature and spatially joined
it to each of the Landsat-derived river ice observations.
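
The two averaging steps described above can be sketched in a few lines of Python. This is an illustrative reading of the procedure rather than the code used in the study: the function names are hypothetical, the input is a synthetic series, and the 30-day window is assumed to cover the 30 days preceding the observation date, excluding that date.

```python
import numpy as np

def daily_mean_from_6h(t2m_6h):
    """Average the four 6-hourly values (0:00, 6:00, 12:00, 18:00) of each day."""
    values = np.asarray(t2m_6h, dtype=float)
    if values.size % 4 != 0:
        raise ValueError("expected four 6-hourly values per day")
    return values.reshape(-1, 4).mean(axis=1)

def prior_mean_sat(daily_sat, day_index, window_days=30):
    """Mean SAT over the window_days days before day_index (assumed to exclude that day)."""
    if day_index < window_days:
        raise ValueError("fewer than window_days of prior data")
    return float(np.mean(daily_sat[day_index - window_days:day_index]))

# Synthetic series: 40 days, each with four identical values equal to the day number
six_hourly = np.repeat(np.arange(40.0), 4)
daily = daily_mean_from_6h(six_hourly)
sat_30d = prior_mean_sat(daily, day_index=30)
print(daily[:3], sat_30d)
```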

NEX-GDDP, or the NASA Earth Exchange Global Daily Downscaled
Climate Projections, is spatially downscaled to 0.25° x 0.25° from a
collection of lower-resolution climate projection results developed
under the CMIP5 framework. The entire collection contains model
output from 21 climate models, each with RCP 4.5 and RCP 8.5 for daily
minimum SAT, maximum SAT and precipitation. We calculated the daily
mean SAT separately for both RCPs by taking the mean of the minimum
and maximum SAT for three models: CESM1-BGC, GFDL-ESM2M and
MIROC-ESM. We then calculated the daily 30-day prior mean SAT, which
was then used to predict future river ice extent.


Calculating the historical river ice cover dataset 

Processing GRWL for river ice cover calculation. GRWL contains
approximately 58 million river centreline points globally, each including
a width value. In multichannel or braided rivers, GRWL computes
an effective centreline, the total flow width, and the number of
channels at each centreline point. As these effective centrelines do
not necessarily trace the actual river channels, we only used single-
channel GRWL centreline points (nchannels = 1, around 80% of rivers
are single channel). Moreover, lakes and reservoirs are part of many
river networks in GRWL, and the centrelines over these features are
flagged. We only used non-lake centreline points in GRWL to limit the
calculation of ice conditions to rivers, as ice dynamics may be different
over lakes and reservoirs. Finally, while GRWL represents our latest
knowledge of global river location and width, it is a static dataset, making
it suboptimal for capturing ice condition on rivers over the 34-year
period, during which varying degrees of morphological changes could
occur. To alleviate this problem, we used only GRWL centreline points
where surface water occurrence based on a previous study is 90% or
above, ensuring that the detected ice conditions for these points were
from water surfaces. After all three filters are applied, our final river ice
cover dataset used approximately 7.5 million GRWL centreline points,
constituting around 271,599 km of river length. This subset of GRWL
largely corresponds to Strahler-Horton stream orders greater than 3.
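
The centreline selection described above amounts to a set of simple per-point conditions. The following sketch applies them to a few made-up records. The property names width_m, lake_flag and nchannel follow the list given earlier in Methods; the occurrence field name, the convention that lake_flag equal to 0 marks a river, and the sample records are assumptions for illustration.

```python
def keep_centreline_point(point, min_width_m=90.0, min_occurrence_pct=90.0):
    """Return True for wide, single-channel, non-lake points with persistent water."""
    return (
        point["width_m"] >= min_width_m                 # width threshold of 90 m
        and point["lake_flag"] == 0                     # assumption: 0 marks a river point
        and point["nchannel"] == 1                      # single-channel rivers only
        and point["occurrence"] >= min_occurrence_pct   # hypothetical field: water occurrence (%)
    )

# Made-up centreline records, for illustration only
points = [
    {"width_m": 120.0, "lake_flag": 0, "nchannel": 1, "occurrence": 95.0},  # kept
    {"width_m": 60.0,  "lake_flag": 0, "nchannel": 1, "occurrence": 95.0},  # too narrow
    {"width_m": 200.0, "lake_flag": 1, "nchannel": 1, "occurrence": 99.0},  # lake or reservoir
    {"width_m": 150.0, "lake_flag": 0, "nchannel": 3, "occurrence": 92.0},  # multichannel
    {"width_m": 100.0, "lake_flag": 0, "nchannel": 1, "occurrence": 40.0},  # intermittent water
]
selected = [p for p in points if keep_centreline_point(p)]
print(len(selected), "of", len(points), "points kept")
```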


Constructing the global river ice cover dataset. The acquisition of
the Landsat Fmask classification (cloud, cloud shadow and snow/ice)
was conducted on the Google Earth Engine platform for all single-
channel GRWL river centreline points with water occurrence ≥90%.
Specifically, we extracted the total number of centreline pixels, as well
as the number of pixels covered by snow/ice, cloud and cloud shadow,
for all images from Landsat TM, ETM+ and OLI sensors, ranging from
March 1984 to December 2018. Then the per-image river ice fraction
($P_{\mathrm{river\_ice}}$) and cloud fraction ($P_{\mathrm{cloud/shadow}}$) were calculated using the
following formulae:

$$P_{\mathrm{river\_ice}} = \frac{N_{\mathrm{snow/ice}}}{N_{\mathrm{total}} - N_{\mathrm{cloud}} - N_{\mathrm{shadow}}}$$

$$P_{\mathrm{cloud/shadow}} = \frac{N_{\mathrm{cloud}} + N_{\mathrm{shadow}}}{N_{\mathrm{total}}}$$

where $N_{\mathrm{total}}$, $N_{\mathrm{cloud}}$, $N_{\mathrm{shadow}}$ and $N_{\mathrm{snow/ice}}$ denote the number of the total,
cloud, cloud shadow and snow or ice pixels from a particular image.
In total, we processed 841,365 Landsat images, covering 1984-2018.
Calculating the per-image ice extent directly on Google Earth Engine
greatly reduces the size of the dataset at no observable cost in terms
of the details of the river ice extent required for this study.
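
The two per-image fractions can be computed directly from the four pixel counts. The Python sketch below is a plain restatement of the formulae above, not the Google Earth Engine code used in the study; the function name and the example counts are hypothetical.

```python
def river_ice_fractions(n_total, n_snow_ice, n_cloud, n_shadow):
    """Return (river ice fraction, cloud/shadow fraction) for one image."""
    n_unobscured = n_total - n_cloud - n_shadow  # centreline pixels not hidden by cloud or shadow
    if n_total <= 0 or n_unobscured <= 0:
        raise ValueError("no usable centreline pixels in this image")
    p_river_ice = n_snow_ice / n_unobscured
    p_cloud_shadow = (n_cloud + n_shadow) / n_total
    return p_river_ice, p_cloud_shadow

# Hypothetical pixel counts for a single image
p_ice, p_cloud = river_ice_fractions(n_total=1000, n_snow_ice=450, n_cloud=60, n_shadow=40)
print(f"river ice fraction: {p_ice:.2f}, cloud/shadow fraction: {p_cloud:.2f}")
```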


Cleaning the river ice dataset. We systematically excluded some 
river ice data before calculating and modelling historical changes. To 
increase the stability of the river ice fraction calculation, we excluded 
river ice data from images for which: (1) $P_{\mathrm{cloud/shadow}}$ is greater than 25%;
(2) $N_{\mathrm{total}}$ < 333 (around 10 km length of river); and (3) the percentage of
river pixels affected by topographic shadow exceeds 5%. This filtering 
reduces the data volume from 841,365 to 407,880 images. 
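
The three exclusion criteria translate into a simple predicate on each image record. In the sketch below the function and argument names are hypothetical, and fractions are expressed on a 0-1 scale.

```python
def passes_quality_filters(p_cloud_shadow, n_total, topo_shadow_fraction):
    """Keep an image unless it meets any of the three exclusion criteria."""
    return (
        p_cloud_shadow <= 0.25            # exclude if cloud/shadow fraction is greater than 25%
        and n_total >= 333                # exclude if fewer than 333 centreline pixels
        and topo_shadow_fraction <= 0.05  # exclude if topographic shadow exceeds 5%
    )

# Hypothetical image records: (cloud/shadow fraction, pixel count, topographic shadow fraction)
images = [(0.10, 500, 0.01), (0.30, 500, 0.01), (0.10, 300, 0.01), (0.10, 500, 0.10)]
kept = [image for image in images if passes_quality_filters(*image)]
print(len(kept), "of", len(images), "images kept")
```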


Calculating global historical monthly mean river ice extent. We esti- 
mated global monthly mean river ice extent through two levels of spatial 
aggregation. For each month, we first calculated mean river ice extent for 
each WRS-2 tile (WRS, Worldwide Reference System) using all available 
Landsat-derived river ice extent observations across 34 years. Then we 
estimated the mean global river ice extent by calculating the weighted 
mean of the tile-level data. We estimated the weight for this aggregation
by multiplying the length of studied rivers in the tile by the extent of
overlap between the current tile and its neighbouring tiles. Specifically, we
estimated the percentage of studied rivers for each WRS-2 tile using the
total number of river centreline points intersecting the corresponding
tile; we then estimated the degree of tile overlap (denoted by $r$) by
calculating the proportion of non-overlapping area out of the total tile area:

$$r = \frac{A_t - A_i}{A_t}$$

where $A_t$ is the area of the WRS-2 tile and $A_i$ is the area of the intersection
between two tiles.
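
A minimal sketch of this weighted aggregation follows, assuming each tile record carries its mean ice extent, its number of river centreline points (used as the measure of river length), its area and the area it shares with neighbouring tiles. All field names and values are hypothetical.

```python
def non_overlap_fraction(tile_area, intersection_area):
    """r: proportion of the tile area not shared with neighbouring tiles."""
    return (tile_area - intersection_area) / tile_area

def weighted_mean_extent(tiles):
    """Weighted mean of tile-level ice extent; weight = river points x non-overlap fraction."""
    weights = [
        t["n_points"] * non_overlap_fraction(t["tile_area"], t["intersection_area"])
        for t in tiles
    ]
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("all tile weights are zero")
    return sum(w * t["ice_extent"] for w, t in zip(weights, tiles)) / total_weight

# Hypothetical tiles, for illustration only
tiles = [
    {"ice_extent": 80.0, "n_points": 100, "tile_area": 1.0, "intersection_area": 0.5},
    {"ice_extent": 40.0, "n_points": 50,  "tile_area": 1.0, "intersection_area": 0.0},
]
print(weighted_mean_extent(tiles))
```

Down-weighting by the non-overlapping fraction keeps rivers that fall in the overlap between adjacent tiles from being counted twice.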


The monthly weighted mean river ice extent is shown in the bar chart
accompanying Fig. 1a. For each month, we also estimated the percentage
of studied rivers actually captured by our satellite records. To estimate
this monthly spatial coverage, we divided the area
of the union of all observed WRS-2 tiles for that month (representing
the length of rivers observed) by the area of the union of all of the WRS-2
tiles intersecting our studied rivers. This coverage percentage is
reported in the bar chart in Fig. 1a. Note that it is necessary to calculate
the union of the tiles before the total covered area as there is overlap
between neighbouring tiles.


Calculating historical changes in river ice extent 

We assessed historical changes in river ice extent by calculating the
difference in mean monthly river ice cover between two decades: March
1984-March 1994 and December 2008-December 2018. The starting
and ending months were chosen to maximize the gap between the two
decades. To compensate for the scarcity of Landsat records from the
earlier decade and from high-latitude regions, the historical analysis
(both the monthly statistics and the aggregated global map) was
carried out by aggregating river ice data from the WRS-2 tile (roughly
1° x 1° at the Equator) to a 5° x 5° tile.


Calculating the global map of historical changes in river ice. To 
produce the map of the change in historical river ice extent for each 
month (Fig. 1b), we calculated the difference in mean river ice extent 
for each 5° x 5° tile. For each month, we kept only the tiles that con- 
tained at least five river ice observations for each of the two decades 
under comparison. The global map was then calculated by averaging 
all available monthly difference values for each tile. Monthly maps of 
the decadal difference in river ice extent can be found in Extended Data 
Fig. 2, which shows the temporal pattern of the change and the spatial 
coverage of the river ice record. 


Calculating global historical changes in monthly mean river ice 
extent. To estimate the global monthly difference in river ice extent 
for each month, we calculated the difference in mean river ice extent 
for 5° x 5° tiles with at least five river ice observations for each decade. 
The monthly difference was then calculated by averaging the mean dif- 
ference value from all available tiles, whereas the value of the observed 
percentage of rivers was estimated by taking the ratio between the total 
area of the available 5° x 5° tiles and the total area of all of the global
5° x 5° tiles that intersect the studied rivers. These statistics are shown
in the bar chart in Fig. 1b.
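
The filtering and averaging steps described above can be sketched as follows. This is a minimal illustration, not the authors' code: the data layout (a dictionary keyed by tile and month), the function names and the sign convention (later decade minus earlier decade) are all assumptions.

```python
from statistics import mean

MIN_OBS = 5  # minimum observations per decade, as stated in the text


def monthly_tile_difference(early, late, min_obs=MIN_OBS):
    """early/late map (tile_id, month) -> list of river ice extents (0-1).

    Returns (tile_id, month) -> late mean minus early mean, keeping only
    keys with at least `min_obs` observations in both decades.
    """
    diffs = {}
    for key in early.keys() & late.keys():
        if len(early[key]) >= min_obs and len(late[key]) >= min_obs:
            diffs[key] = mean(late[key]) - mean(early[key])
    return diffs


def tile_map(diffs):
    """Global map: average all available monthly differences for each tile."""
    per_tile = {}
    for (tile, _month), d in diffs.items():
        per_tile.setdefault(tile, []).append(d)
    return {tile: mean(values) for tile, values in per_tile.items()}


def global_monthly_difference(diffs, month):
    """Monthly statistic: average the differences of all available tiles."""
    values = [d for (_tile, m), d in diffs.items() if m == month]
    return mean(values) if values else None


# Made-up observations for two 5-degree tiles, "A" and "B".
early = {("A", 1): [0.9] * 5, ("A", 2): [0.8] * 5, ("B", 1): [0.7] * 4}
late = {("A", 1): [0.8] * 5, ("A", 2): [0.6] * 6, ("B", 1): [0.5] * 5}
diffs = monthly_tile_difference(early, late)
```

Tile "B" is dropped for January because the earlier decade has only four observations, which mirrors the five-observation rule.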


Quantifying Landsat spatial and temporal sampling patterns. The 
aggregation done here to calculate historical changes in river ice could 
cause unintended systematic bias owing to the potential biases in the 
sampling time (within each month) and location (within each tile) between
the two decades. We conducted the following two assessments
to show that (1) the differences in both the sampling date for each month
and the sampling location for each 5° x 5° tile were small compared with
their respective ranges of possible values (mean sampling time difference:
−0.115 days and standard deviation: 3.4 days; mean sampling location
difference: 0.012° and standard deviation: 0.41°) and (2) there was no
correlation between the difference in sampling and the difference in the
river ice extent, both in time and location (Pearson correlation coefficient
$r_{\mathrm{temporal}}$ = 0.04 and $r_{\mathrm{spatial}}$ = 0.07). The results of these two assessments can
be found in Extended Data Fig. 7. 


Modelling river ice cover 

Building the river ice cover model. After exploring the relationship
between river ice extent and its corresponding 30-day prior mean
SAT, we chose logistic regression to model their relationship. Logistic
regression assumes a linear relationship between the logarithm of
the odds of a phenomenon (ice) and the predictors (the 30-day prior
mean SAT ($\mathrm{SAT}_{30}$) and a categorical predictor we designated PERIOD),
which our data follow. We used the following equations to model the
river ice extent.


$\mathrm{odds(ice)} = N_{\mathrm{snow/ice}}/N_{\mathrm{water}} = P_{\mathrm{river\_ice}}/(1 - P_{\mathrm{river\_ice}})$


$\log(\mathrm{odds(ice)}) = a\,\mathrm{SAT}_{30} + b\,\mathrm{SAT}_{30} \times \mathrm{PERIOD} + c$


The PERIOD predictor divides the data into two periods encompassing
freeze-up (August-January = 0) and break-up (February-July = 1).
The rationale for adding the PERIOD predictor is based on the differ- 
ent control strengths of temperature over ice processes between the 
freeze-up and breakup periods—a pattern suggested from analysis of 
in situ records in Canada”’. 

We applied the regression model to the Landsat-derived river ice
extent and ERA5-derived $\mathrm{SAT}_{30}$. The parameters were estimated as
a = −0.32, b = −0.05 and c = −0.82. Using the model, we were able to
compare the strength of the control that $\mathrm{SAT}_{30}$ exerts on ice dynamics:
we estimated that $\mathrm{SAT}_{30}$ control over break-up is stronger than that over
freeze-up, as b is negative and so steepens the temperature slope from
a to a + b during break-up. The entire dataset was used to assess the
skill of the logistic model and the 0-°C-isotherm model (see Fig. 2b).
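
To make the fitted model concrete, the sketch below inverts the log-odds equation to obtain a predicted river ice fraction from the reported parameters. It assumes that the 30-day prior mean SAT is expressed in °C and that PERIOD is determined from the calendar month as described above; the function names are hypothetical.

```python
import math

# Parameters reported in the text.
A, B, C = -0.32, -0.05, -0.82


def period_flag(month):
    """0 for freeze-up (August-January), 1 for break-up (February-July)."""
    return 1 if 2 <= month <= 7 else 0


def ice_fraction(sat30, month):
    """Predicted river ice extent (0-1) from the 30-day prior mean SAT.

    log(odds) = a*SAT30 + b*SAT30*PERIOD + c, then odds -> probability.
    sat30 is assumed to be in degrees Celsius.
    """
    log_odds = A * sat30 + B * sat30 * period_flag(month) + C
    return 1.0 / (1.0 + math.exp(-log_odds))


p_zero = ice_fraction(0.0, 1)         # at 0 degrees only the intercept c matters
p_freeze = ice_fraction(-10.0, 12)    # December, freeze-up slope a
p_break = ice_fraction(-10.0, 3)      # March, break-up slope a + b
```

At the same sub-zero temperature the break-up period yields a higher predicted ice fraction than the freeze-up period, which is the stronger temperature control implied by the negative b.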


Projecting river ice cover at the end of the century. We projected
future river ice extent by applying the river ice model to the future SAT
data produced by CMIP5 climate projections. We used SAT outputs
from CESM1-BGC, GFDL-ESM2M and MIROC-ESM climate simulations
under both RCP 4.5 and RCP 8.5 to estimate future river ice extent and
duration up to the end of the century. These models were chosen to
account for potential trend biases in predicted temperature. In a similar
way to evaluating the sensitivity of a model, which is common in the
climate modelling community, we calculated the mean global SAT
difference between the periods 2006-2036 and 2069-2099 for 21 models
included in the CMIP5 ensemble and selected three models to represent
the variability in relative temperature change (see Extended Data
Fig. 8a). Projected future declines in river ice extent and duration are
summarized in Extended Data Fig. 8b. To project future ice conditions,
we calculated daily river ice extent throughout the periods 2009-2029
and 2080-2100, from which we then calculated (1) monthly mean river
ice extent and the difference between the two periods (Fig. 3, Extended
Data Figs. 3, 4) and (2) mean river ice duration (Fig. 4, Extended Data Fig. 5).
The summary future changes in river ice extent and duration reported
here were calculated by aggregating the values from the corresponding
map of change at the locations of studied rivers.


Estimating the relationship between river ice condition and global
mean surface temperature. For each year between 2009 and 2099,
we estimated the percentage of ice-affected rivers and the mean ice
duration across the globe. The annual percentage of ice-affected rivers was
derived by calculating the annual mean river ice extent for each studied
river location, then flagging it as ice-affected if the mean value exceeded
0.041, equivalent to 15 days of effective ice cover over 365 days. The
annual duration for each river location was estimated by counting the
number of days when projected river ice extent exceeded 50%. The annual
global mean surface temperature was then computed by averaging the daily
mean SAT across the year and then aggregating across the globe.
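
The two per-location statistics can be sketched as below, assuming a list of daily projected river ice extents (0-1) for one location and one year. The function and constant names are hypothetical; the thresholds are those given in the text.

```python
ICE_AFFECTED_THRESHOLD = 15 / 365   # about 0.041: 15 days of effective ice cover
ICE_COVER_THRESHOLD = 0.5           # a day counts towards duration above 50%


def is_ice_affected(daily_extent):
    """True if the annual mean river ice extent exceeds 15/365."""
    return sum(daily_extent) / len(daily_extent) > ICE_AFFECTED_THRESHOLD


def ice_duration_days(daily_extent):
    """Number of days on which projected river ice extent exceeds 50%."""
    return sum(1 for extent in daily_extent if extent > ICE_COVER_THRESHOLD)


# Two made-up river locations for a 365-day year.
cold_river = [0.9] * 60 + [0.0] * 305   # 60 days of 90% ice cover
mild_river = [0.6] * 10 + [0.0] * 355   # 10 days of 60% ice cover
```

The mild river has a non-zero ice duration (10 days) but is not flagged as ice-affected, because its annual mean extent (6/365) stays below the 15/365 threshold.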


Sources of errors 

Errors in a global dataset (especially one that quantifies highly dynamic
Earth surface processes) are often unavoidable. Through building
the historical river ice dataset, modelling the river ice processes and
predicting future river ice conditions, we have identified three major
sources of errors: errors from misclassifications in Fmask, errors in
SAT values in the ERA5 dataset and errors in the projections of future
river ice condition.

Qualitative evaluation of Fmask snow/ice classification. To qualitatively
assess the accuracy of the Fmask snow/ice classification, we
randomly selected 160 images for visual evaluation, using sampling
stratified by temperature range and observed river ice cover. We found
that snow/ice classification is adversely affected in the following
situations (from the most to the least frequent):

* Commission error in classifying turbid water as snow/ice—found 
mostly over the Yellow River in China and Amu Darya in Turkmenistan. 
Less frequently found over the Red River, Arkansas River and Missouri 
River in the United States. 

* Commission error in classifying cloud as snow/ice—no strong spatial 
pattern found for this type of error. 

* Omission error in classifying topographically shaded snow/ice 
as water. 

* Omission error in classifying thin ice as water: this type of error is
rare and mostly observed on still portions of rivers (such as reservoirs
and lakes, which are not used in this analysis).


Evaluating Landsat-derived river ice extent. We used the US Geological
Survey’s quality band snow/ice classification to derive river ice extent.
The snow/ice classification is calculated using Fmask. Fmask classifies
each pixel of the Landsat image into one of five classes. Although Fmask
is considered the most accurate cloud classification algorithm for
Landsat images, its snow/ice classification has not been evaluated for
accuracy. Here we used in situ reported river ice conditions to evaluate
the accuracy of river ice extent calculated using Fmask.

Although direct in situ river ice records are scarce, we were able to 
obtain the river ice status for the state of Alaska, United States, from the 
archive of the National Weather Service (NWS). We also inferred river ice
conditions using the backwater flag that accompanies the daily gauge
flow records from the Water Survey of Canada (WSC). The backwater flag
was used in previous studies as a strong indicator of ice conditions.

In the following, we first explain our approach to extracting and 
cleaning the in situ datasets. We then introduce our method for cal- 
culating the river ice extent from Landsat images and matching it to 
the in situ observations. Finally, we show the results of the evaluations. 
NWS river ice observations. We obtained historical records for 485 
stations in Alaska from S. Lindsey at the Alaska-Pacific River Forecast 
Center. We encountered two challenges in using this dataset for evalua- 
tion. First, the files we obtained, while containing ice observations and 
station descriptions, do not contain the geolocations of these stations. 
Fortunately, the station description often followed the ‘river name_at/ 
near_location_name’ naming convention (for example, ‘Yukon River 
at Beaver’). We were able to manually identify 177 stations that have 
both freeze-up and break-up information for at least one water year, 
115 of which we successfully georeferenced and 13 of which we eventually
used for evaluation after excluding sites that either are close to a
river that is too small for Landsat to observe or did not have records
that temporally overlapped with the Landsat observations. Second,
NWS reported multiple thresholds that indicate various ice conditions 
during both the freeze-up and break-up periods. However, there were 
varying amounts of missing data for these dates. For example, while
the NWS directly reported freeze-up date, the majority of the values
in this field were missing, which rendered it of very limited value.
Instead, we used the first ice date as the date of ice onset and ‘breakup’
as the date of ice-off.

WSC flag. The WSC includes flags in its daily discharge data that indicate 
the state of flow conditions. Among these flags, the backwater flag or ‘B’ 
flag is used to indicate ice conditions. In our evaluation, we followed
existing practice, treating dates with B flags as dates of river ice cover. 
Matching in situ ice coverage with Landsat-derived ice coverage. After 
merging the geolocations of the NWS stations and the WSC stations, 
we calculated the river ice conditions for these locations according 
to Fmask classifications. Specifically, for each in situ location, we
calculated the Fmask-derived river ice extent for GRWL rivers
(nchannel = 1; lake flag = 0; width_mean = 90 m) located within a
1,500-m radius of the gauge.

To evaluate the Landsat-derived ice coverage against the in situ
records, we matched the datasets spatially (to the 1,500 m proximity
of each station) and temporally (to the same day). The same-day tem- 
poral matching was straightforward for WSC records, as they reported 
daily ice conditions. However, as the NWS reported only dates of ice-on 
and ice-off, we treated dates that fell between an ice-on date and the 
following ice-off date as ice-covered dates, and those that fell between 
an ice-off and the following ice-on date as ice-free dates. In total, we 
matched 18,930 pairs (NWS-Alaska: 515 pairs over 13 sites; WSC: 18,415 
pairs over 139 sites) of in situ and Landsat-derived river ice observations
for our evaluation. 
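
The interval rule used for the NWS records can be sketched as follows. This is an illustration under stated assumptions: the event list layout and function name are hypothetical, and treating the ice-on and ice-off dates themselves as the first day of the new state is a choice the text does not specify.

```python
from datetime import date


def is_ice_covered(day, events):
    """Classify `day` from a chronologically sorted list of (date, kind)
    events, where kind is "on" (ice-on) or "off" (ice-off).

    Returns True (ice-covered), False (ice-free), or None when `day`
    precedes the first recorded event and the state is unknown.
    """
    state = None
    for when, kind in events:
        if when > day:
            break
        state = (kind == "on")
    return state


# Made-up record for one station.
events = [
    (date(2010, 10, 20), "on"),
    (date(2011, 5, 5), "off"),
    (date(2011, 10, 25), "on"),
]
winter = is_ice_covered(date(2011, 1, 15), events)
summer = is_ice_covered(date(2011, 7, 1), events)
unknown = is_ice_covered(date(2010, 1, 1), events)
```

A Landsat observation dated 15 January 2011 would be paired with an ice-covered in situ state, and one dated 1 July 2011 with an ice-free state.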
Evaluating Landsat-derived river ice coverage. When comparing the 
Landsat-derived river ice coverage to that reported from the field, we 
first converted the continuous values (0-100%) to a binary ice condition
using a threshold of 50%: ice coverage >50% is classified as
‘ice-covered’ and <50% is classified as ‘ice-free’. The 50% threshold was
chosen as we found that the choice of threshold had little impact on the
final evaluation. Then we calculated the accuracy, sensitivity and 
specificity by constructing a confusion matrix using the in situ reported 
ice condition as a reference and the Landsat-derived ice state as the 
observation. Overall, Landsat-derived river ice coverage was highly 
consistent with the in situ reports (accuracy = 0.94, sensitivity = 0.91, 
specificity = 0.96, Extended Data Fig. 9). When the analysis was broken 
down into monthly evaluations, accuracy was highest during summer 
months (June-August: mean accuracy: 0.98) and lower during the 
remaining months, with no particular seasonal pattern (accuracy: 
0.8-1.0 with mean accuracy 0.91, Extended Data Fig. 9b). Reduced 
accuracy occurred during months when river ice was present and was 
attributed to: (1) complicated reflectance returns due to dynamic tran- 
sition between ice and water; (2) increased turbidity accompanying ice 
break-up; (3) the difference in scale between the Landsat-derived ice 
condition (averaging across a 1,500-m radius) and the in situ records
(scale unknown, see examples in Extended Data Fig. 9c); and (4) errors
in the in situ records. Notably, the accuracy derived from the observation-
based NWS ice conditions (overall accuracy: 0.97) was generally higher
than that from the WSC (overall accuracy: 0.94) (see also Extended Data
Fig. 9b). The fact that the ice condition from the WSC was inferred, instead 
of observed, could have contributed to this discrepancy. 
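
The evaluation metrics can be sketched with a small confusion-matrix routine. The input layout and function name are assumptions, and an extent of exactly 50% is treated as ice-free here because the text does not say how ties were handled.

```python
def evaluate(landsat_extent, in_situ_ice, threshold=0.5):
    """Compare Landsat-derived ice extent (0-1) with in situ ice state.

    Extent above `threshold` is classified as ice-covered; the in situ
    state (True = ice) is the reference. Returns accuracy, sensitivity
    and specificity.
    """
    tp = fp = tn = fn = 0
    for extent, truth in zip(landsat_extent, in_situ_ice):
        predicted = extent > threshold
        if predicted and truth:
            tp += 1
        elif predicted and not truth:
            fp += 1
        elif not predicted and truth:
            fn += 1
        else:
            tn += 1
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),   # ice-covered cases correctly detected
        "specificity": tn / (tn + fp),   # ice-free cases correctly detected
    }


# Six made-up matched pairs.
extents = [0.9, 0.7, 0.2, 0.1, 0.6, 0.3]
truth = [True, True, True, False, False, False]
scores = evaluate(extents, truth)
```

Re-running `evaluate` with other `threshold` values is one simple way to check how sensitive the scores are to the threshold choice.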

Comparison with in situ river ice records also showed no system- 
atic differences among Landsat sensors. Accuracy was similar across 
data from Landsat 5 (TM), Landsat 7 (ETM+) and Landsat 8 (OLI) (see 
Extended Data Table 1). It is worth noting that Landsat 8 has an extra 
flag for cirrus clouds, which could potentially improve the quality of 
ice data by better excluding cloud-affected observations. However, 
we decided not to use this flag, as its inclusion could potentially cause 
varying data quality between sensors, which then could compromise 
the detection of historical river ice change. 


Human influence on river ice. Human activities that affect rivers—such 
as river engineering and water pollution—tend to systematically and 
permanently alter the river morphology, as well as the thermal and 
physical properties of the flow. River ice regimes affected by these influ- 
ences cannot be explained by the changes in SAT alone. In one previous 
study, human activity was found to affect the river ice regime to a much
greater degree than climate variation along two highly regulated river
reaches in Europe. While we acknowledge the contribution of these
non-climatic factors, quantification of their effects globally exceeds the 
scope of this study. Nonetheless, interpretation of our results in rivers/ 
regions that are known to be heavily engineered requires extra caution. 

Although direct anthropogenic influence on river ice regimes should
be considered when interpreting both in situ and remotely sensed
data, interpreting remotely sensed data requires extra consideration
of the sensitivity of the classification algorithms to anthropogenic
influence. Otherwise, there is a risk of falsely attributing changes in
river ice to changing climate. For example, for the lower Yellow River,
our detection of great historical river ice decline is likely to be largely
due to the combined effect of changes in water turbidity (mostly owing
to recent damming upstream) and the tendency for Fmask to falsely
classify turbid water as snow/ice.


Uncertainties in ERA5 SAT. Because ERA5 was released very recently, there
is no spatially comprehensive evaluation of its SAT, so its overall
accuracy remains unknown. However, in studies that evaluated
this parameter regionally, ERA5 has outperformed other reanalysis
datasets and can accurately represent the magnitude and variability
of near-surface air temperature over Antarctica.


Spatial scale mismatch between temperature dataset and river size.
When attaching the ERA5 temperature data (spatial resolution of
approximately 30 km) to our river ice extent dataset and modelling river
ice based on the merged dataset, as well as predicting future river ice
extent with temperature data from NEX-GDDP (spatial resolution 0.25°),
we implicitly assumed that temperature for a grid cell is representative
of that experienced by the river in it. This assumption could result in
bias when mixing temperatures from land and water pixels, especially when
large topographic variability exists in the grid cell, as rivers tend to
flow along topographic low areas, and elevation greatly affects
temperature. The degree of this inherent systematic bias may be reduced
in the future with the development of more advanced reanalysis datasets.


Data availability 


The global river ice dataset can be accessed at
https://doi.org/10.5281/zenodo.3372709. The in situ and Landsat-derived
river ice records for evaluating ice classification can be accessed at
https://doi.org/10.5281/zenodo.3372754.
accessed online at the project’s GitHub page (https://github.com/
and all figures in the paper (including the extended data figures) were
made using R statistical software (http://www.R-project.org/). 

Extended Data Fig. 2| Monthly maps of the changes in river ice extent between 1984-1994 and 2008-2018. Black indicates no data or no studied river.

Extended Data Table 1| Fmask-derived river ice evaluation across Landsat missions

To address global challenges, 193 countries have committed to the 17 United
Nations Sustainable Development Goals (SDGs). Quantifying progress towards
achieving the SDGs is essential to track global efforts towards sustainable
development and guide policy development and implementation. However,
systematic methods for assessing spatio-temporal progress towards achieving
the SDGs are lacking. Here we develop and test systematic methods to quantify
progress towards the 17 SDGs at national and subnational levels in China. Our analyses 
indicate that China’s SDG Index score (an aggregate score representing the overall 
performance towards achieving all 17 SDGs) increased at the national level from 
2000 to 2015. Every province also increased its SDG Index score over this period. 
There were large spatio-temporal variations across regions. For example, eastern 
China had a higher SDG Index score than western China in the 2000s, and southern 
China had a higher SDG Index score than northern China in 2015. At the national level, 
the scores of 13 of the 17 SDGs improved over time, but the scores of four SDGs 
declined. This study suggests the need to track the spatio-temporal dynamics of 
progress towards SDGs at the global level and in other nations. 


To achieve these ambitious SDGs, the world needs to monitor pro- 
gress towards all 17 SDGs by assessing past and current conditions 
at national and subnational levels. However, no study has explored
the spatio-temporal dynamics of progress towards the SDGs at both 
national and subnational levels. Such information is urgently needed, 
as many countries face the challenge of achieving sustainability in 
times of growing population, uneven development across regions 
within their borders and resource scarcity under rapidly developing 
economies. A spatio-temporal analysis of sustainable development can 
help countries to identify hotspot regions for targeted policy action 
and for tracking progress towards achieving the SDGs. Understand- 
ing the differences in sustainable development between developed and 
developing regions over time can help a nation to balance sustainable 
development across its regions. 

In this study, we developed systematic methods to quantify the SDGs
and provided a demonstration of quantification by performing a
comprehensive spatio-temporal analysis of progress towards all 17 SDGs in
China, the largest developing country both in areal extent and
population. Over the past several decades, China has experienced rapid
economic development, reflected in its exceptional growth in gross
domestic product (GDP) and its rise to become the world’s second-largest
economy. However, China also faces large socioeconomic challenges
such as income and gender inequality, and environmental challenges
such as water scarcity and pollution, energy shortages, and air and
soil pollution. These socioeconomic and environmental challenges
within China vary substantially from region to region and have changed
noticeably over time. China is trying to achieve sustainability under
complex environmental and socioeconomic challenges and policies.
To promote sustainable development, China has implemented a variety
of policies such as the ‘Western Development Strategy’ and the ‘Natural
Forest Conservation Program’.

We tracked China’s progress towards achieving the SDGs at the 
national and subnational (provincial) levels by quantifying (scoring) 
the SDGs over time (see details in the Methods). We addressed four 
major questions. First, how has sustainable development in China, as 
measured in terms of the SDGs, evolved at the national level? Second, 
how has sustainable development varied across China’s provinces over 
time? Third, how have differences in sustainable development between 
more-developed and less-developed provinces in China evolved over 
time? Fourth, how has progress varied among the different SDGs? 

To answer these questions, we used annual time series data relevant
to the 17 SDGs from 2000 to 2015 at the national level and calculated
the SDG Index score (0-100), which consists of individual scores for
the 17 SDGs and represents China’s overall performance in achieving
all 17 SDGs (see details in the Methods). In total, 119 SDG indicators
were used in this assessment (see data sources and indicator sources
in Supplementary Table 1). We detected spatio-temporal changes
in SDG Index scores across China’s provinces based on data for the
17 SDGs at the provincial level in 2000, 2005, 2010 and 2015. We then
compared the change in SDG Index scores over time between developed
and developing provinces (determined by each province’s average GDP
per capita during 2000-2015; see details in the Methods) during the
same period. Finally, by comparing scores for the individual SDGs we
examined the relative progress toward achieving the different SDGs.

Fig. 1| Change in China’s SDG Index score and individual SDG scores. a, SDG
Index score. b, Scores of selected SDGs (2, 6, 9, 15 and 17) at the national level
from 2000 to 2015. For data sources, see Methods.


Results 


Our results indicate that China has improved its SDG Index score at the 
national level over time (Fig. 1; Extended Data Fig. 1). Its national SDG 
Index score increased by approximately 21.9%, from a score of 45.5 in
2000 to 55.4 in 2015. 

Notably, at the provincial level, eastern China had a higher SDG Index 
score than western China in the 2000s, while southern China had a 
higher SDG Index score than northern China in 2015, suggesting that 
substantial changes in sustainable development occurred across dif- 
ferent regions (Fig. 2; see Supplementary Tables 2,3). SDG Index scores 
at the provincial level ranged from 31.4 to 54.1 with a mean value of 42.2 
in 2000, from 38.1 to 57.6 with a mean value of 45.2 in 2005, from 42.5 
to 63.9 with a mean value of 49.8 in 2010, and from 47.0 to 66.1 with 
a mean value of 54.9 in 2015, reflecting a 30.0% increase in the mean 
value of the SDG Index score across provinces over time. The change 
in SDG Index score among provinces from 2000 to 2015 ranged from 
an 11.1% increase (Shanghai) to a 51.8% increase (Ningxia).


All provinces increased their SDG Index scores from 2000 to 2015 
(Fig. 2; Supplementary Table 3). Developed provinces had higher SDG 
Index scores than developing provinces throughout our study period 
(Fig. 3; Supplementary Table 4). However, developing provinces experi- 
enced a greater growth rate in their average SDG Index scores than did 
developed provinces. These dynamics were also observed between the 
top five developed provinces and the bottom five developing provinces 
(Fig. 3; see details in the Methods). 

At the national level, the scores of 13 of the 17 SDGs improved, while 
the scores of the remaining four SDGs decreased over time (Fig. 4). The 
four SDGs with declining scores, in order of greatest to least decline, 
were SDG 14 (life below water), SDG 12 (responsible consumption and 
production), SDG 5 (achieve gender equality) and SDG 13 (climate 
action) (Fig. 4). The three SDGs that improved the most, in order of 
greatest to least improvement, were SDG 9 (industry, innovation and 
infrastructure), SDG 10 (reduced inequalities), and SDG 17 (affordable
and clean energy). Generally, the changes in SDG scores at the
provincial level showed similar dynamics as those at the national level 
(Supplementary Table 5). In terms of absolute SDG score, the bottom 
five SDGs, which lagged behind the other SDGs at the national level 
in 2015, included SDGs 15 (life on land), 14 (life below water), 17 (part- 
nerships for the goals), 8 (decent work and economic growth) and 10 
(reduced inequalities); see Supplementary Table 3. 


Discussion 


The spatio-temporal patterns of China’s SDG Index scores may result
from a number of factors, including the implementation of policies
that have different regional impacts, geographical conditions,
climate and infrastructure. At the national level, factors such as
governmental support for sustainability and investment in science
and technology can strongly promote progress in national sustainable
development (Supplementary Discussion). Under the Chinese reform and
opening-up policies that began in the late 1970s and early 1980s, the
Chinese government focused on facilitating economic development
more in eastern coastal regions than in inland regions, resulting in
more advanced social services such as education and healthcare in
eastern China. Eastern China’s relatively flat topography and favourable
climate also make it more conducive to human habitation, as
well as industrial and agricultural development. Conversely, western
China’s rugged topography, combined with its distance from the
coast, complicates transportation within the region and to and from
other regions. As a result, in 2000, western China experienced limited
urbanization and socioeconomic development and had the lowest
industrialization level and highest poverty rate in China. Western
China’s ecological assets have also historically limited its development
(Supplementary Discussion). To alleviate this regional disparity, the
Chinese government implemented the Western Development Strategy
in 1999 to improve environmental and socioeconomic conditions in
western China. In 1999, only 29% of the Chinese government’s fiscal
transfers were allocated to western China, but this reached 39.4% in
2010. Under the Western Development Strategy, both infrastructure
development and ecological conservation in western China have
greatly improved (Supplementary Discussion). Meanwhile, after
2010 the growth rate of progress towards sustainable development
(SDG Index score) in northeastern China fell behind other regions
in socioeconomic development and environmental conservation
because of low efficiency in resource use, unsustainable economic
development and severe environmental pollution (Supplementary
Discussion). Developed provinces experienced smaller increases in
the SDG Index score than developing provinces mainly because they
face problems associated with rapidly growing economies, such as a
tendency for socioeconomic and gender inequality to increase, as
well as intensive resource consumption and severe environmental
pollution (Supplementary Discussion).

Fig. 2| Spatial pattern of SDG Index scores in 2000, 2005, 2010 and 2015 for
31 Chinese provinces. a, 2000. b, 2005. c, 2010. d, 2015. The data for the base
map was derived from the Resource and Environment Data Cloud Platform.

China’s rapid technological advances, improved social services such
as education and healthcare, and environmental conservation policies
have all enhanced sustainability. However, environmental
problems such as water pollution and scarcity and land degradation 
still pose a great threat to China’s sustainability because these burdens 
are often associated with other environmental problems such as bio- 
diversity loss and severe droughts. Moreover, China’s social problems, 
such as inequality, can be linked to other complex social problems 
(such as mental illness, violence, obesity, imprisonment, homicide, 
teen pregnancy, drug abuse and poor academic performance) that
make sustainability difficult to achieve. The Chinese government could
therefore prioritize the SDGs that lag behind other SDGs, such as SDG
14 and SDG 15, while facilitating holistic sustainability through
integrated policy action (Supplementary Discussion). In particular, for
these SDGs more effective policies aimed at protecting life in water 
and on land are required. China can build on previous successes to deal 
with regional discrepancies. For example, policymakers could consider 
more strategies to promote development in northern China in order 
to reduce the gap in sustainable development between northern and 
southern China. Since the gap in sustainable development between 
western and eastern China has shrunk since the Western Development 
Strategy was implemented, lessons learned from the Western Develop- 
ment Strategy may help to close the gap in sustainable development 
between northern and southern China. 
indicates an increase in the score from 2000 to 2015, while a negative value
(red) indicates a decrease in the score from 2000 to 2015. For data sources,
see Methods.


Future research could focus on the spillover effects of one region’s 
actions on the sustainable development of other regions within 
China as well as on spillover effects across national borders
(Supplementary Discussion). Furthermore, exploring trade-offs and
synergies between SDGs can help to reveal the complex mechanisms
and consequences of sustainable development. Research assessing
the complex impacts of policies on sustainable development is
also needed. 

This study provides a temporal sustainability assessment of all 17
SDGs at national and subnational levels. China has mandated the
monitoring of the progress toward the SDGs, but it has not developed
systematic and comprehensive evaluation methods. Thus, the methods
outlined in our paper are of value to China’s monitoring efforts. Our 
approach might also lay a foundation for analysing spatio-temporal 
patterns of SDG progress for other countries and across local to global 
levels. 
Methods


Six interrelated steps for calculating and comparing SDG scores 
Step 1: indicator selection and data sources. We selected indica- 
tors from a combination of the United Nations’ official list of global 
Sustainable Development Goal indicators”, the 2018 SDG Index and 
Dashboards Report” anda report of the United Nations titled “Indicators 
anda Monitoring Framework for the Sustainable Development Goals”. 
The 2018 SDG Index and Dashboards Report and the Monitoring Frame- 
work Report were published by the Sustainable Development Solutions 
Network, which operates under the auspices of the United Nations to 
promote the implementation of the SDGs and the Paris Climate Agreement.
The 2018 SDG Index and Dashboards Report provides a robust, quantitative
and transparent method of measuring SDG baselines at the country level that
has been used in a subsequent peer-reviewed paper. In addition to the above
indicators, we also constructed additional indicators based on our
understanding of the SDG targets.

For each SDG, we chose as many SDG indicators as was feasible from 
the list of recommended indicators, based on data availability both at 
the provincial and national levels and the availability of the indicators 
across organizational levels and temporal scales (see Supplementary 
Methods for an example of indicator selection for SDG 6). This approach
follows that of previous studies. Our list of indicators included a
total of 119 SDG indicators at both the national level and provincial
level over time, which is greater than the number of indicators in the
2018 SDG Index and Dashboards Report (which used 88 indicators to
assess China’s SDG performance for a single year).

Data for the selected indicators in this study were obtained from 
the following authoritative sources: the National Bureau of Statistics 
of the People’s Republic of China, the China Statistical Yearbook, the
Finance Yearbook of China, the China Statistical Yearbook on the
Environment, the Educational Statistics Yearbook of China, the China
Health Statistics Yearbook, the China Energy Statistical Yearbook
and the China Population Statistics Yearbook. See Supplementary
Table 1 for a list of SDGs and their corresponding indicators and the data
sources used in this paper.


Step 2: bound selection. To ensure comparability across different 
SDGs, the indicator values for each SDG were normalized to a standard
scale ranging from 0 (worst-performing indicator value towards achiev- 
ing SDGs, or worst performance) to 100 (best-performing indicator 
value towards achieving SDGs, or best performance). ‘Performance’ 
refers to the progress of a nation or subnational unit towards achieving a 
single SDG or all 17 SDGs as a whole, measured in terms of SDG indicator 
values. A higher normalized SDG score indicates better performance 
towards achieving an SDG. For the national level analysis, we pooled 
the annual values for 2000-2015 for the selected indicator metrics of 
each SDG. Thus, the data for each SDG indicator includes 16 indicator 
values (one per year) that reflect the temporal dynamics of China’s 
overall performance towards that SDG indicator. At the provincial level, 
we pooled, again separately for each SDG indicator, the values of the 
indicator metric for the 31 provinces for four years (2000, 2005, 2010 
and 2015). In this case, the data reflect the temporal dynamics for each 
province towards meeting the individual SDGs. 

We followed the methods proposed by the 2018 SDG Index and Dashboards
Report to normalize the national and provincial data arrays
for each SDG indicator. These methods of establishing an upper and a
lower bound minimize the potential effects of skewed data because they 
offset the effects of extreme values on both tails of the data distribution. 

Similarly, we identified upper and lower bounds for each SDG indicator
in order to minimize the potential effects of skewed data distributions
on the standardized values during normalization. Our method
for setting the upper bound is similar to the approach used in the 2018
SDG Index and Dashboards report in order to make it easier to compare
China with other countries. The upper bound for each indicator was
determined using a five-step decision tree. If the condition for an earlier
step is met, then all of the later steps are skipped. First, for all indicators 
that are also used in the 2018 SDG Index and Dashboards report, we 
adopted the bound used in the 2018 SDG Index and Dashboards report. 
Second, we used relevant absolute quantitative thresholds for SDGs 
and targets, such as ‘no poverty’ and ‘absolute gender equality’. Third,
if no explicit SDG target was stated, we adopted the principle of ‘leave 
no one behind’ to determine the upper bound of zero deprivation or 
universal access for the following types of indicators: (1) public service 
coverage, and disease and pollution control, (2) measures of ending 
hunger (consistent with the SDG purpose to remove extreme hunger in 
all forms), and (3) access to basic infrastructure (for example, mobile 
phone coverage). Fourth, where they exist, we used science-based 
targets set for 2030 or later. Fifth, we set the upper bound for all other 
indicators equal to the average of the top five performers across the 
provincial and national levels together. 
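
The five-step order described above can be sketched in code. This is a minimal illustration under stated assumptions, not the procedure used to produce the results: the function name `upper_bound` and all of its parameters are hypothetical, and each of the first four steps is represented by an optional value that is supplied only when that step applies to the indicator.

```python
from statistics import mean


def upper_bound(values, higher_is_better=True, report_bound=None,
                absolute_threshold=None, leave_no_one_behind=None,
                science_target=None):
    """Return the upper bound from the first applicable step (hypothetical helper).

    Steps 1-4 are passed in as optional values; the first one that is not
    None wins and all later steps are skipped.
    """
    for candidate in (report_bound, absolute_threshold,
                      leave_no_one_behind, science_target):
        if candidate is not None:
            return candidate
    # Step 5: average of the top five performers in the pooled
    # provincial and national values for this indicator.
    ranked = sorted(values, reverse=higher_is_better)
    return mean(ranked[:5])
```

For an indicator with no bound from steps 1 to 4, `upper_bound([1, 2, 3, 4, 5, 6, 7])` averages the five largest values; for an indicator where lower values are better, passing `higher_is_better=False` averages the five smallest values instead.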

In terms of lower bound, for all indicators that were used in the 2018 
SDG Index and Dashboards report, we adopted the lower bound used 
in the 2018 SDG Index and Dashboards report. For other indicators, 
the lower bound was defined as the SDG indicator value (one data
point) located close to the value of the bottom 2.5th-percentile
performer (across all provinces over four time steps (2000, 2005, 2010
and 2015) and the whole of China over time (2000-2015 annually)) of the
sorted arrays, which was also similar to the criteria in the 2018 SDG Index
and Dashboards report for selecting the lower bound. If the place
of the bottom 2.5th percentile was located between two consecutive
integers, the larger integer was used as the place for the lower bound
when a larger indicator value represented better performance, and the
smaller integer was used when a larger indicator value represented
worse performance. We specified ‘top-performing SDG indicator
values’ and ‘bottom-performing SDG indicator values’ rather than
referring to the data points as simply high or low values, because a low
value may represent high performance in some SDGs (for example, 
zero poverty) but poor performance in others (for example, amount 
of protected areas). 


Step 3: normalization of indicator values. After establishing the lower 
and upper bound for each indicator, we used the following formula to 
normalize SDG indicator values towards meeting an SDG target at the
national and provincial levels on a scale of 0 to 100:


$$x' = \frac{x - \min(x)}{\max(x) - \min(x)} \times 100$$


where x is the original data value of each SDG indicator, max/min represents
the upper/lower bounds for the best/worst performance, and
x' is the normalized individual score for a given SDG indicator. All nor-
malized values greater than the upper bound received a score of 100, 
and all normalized values less than the lower bound received a score of 
0. Values between the upper and lower bounds were distributed along 
the spectrum from the worst performance (score 0) to the best perfor- 
mance (score 100). A province with a score of 50 is halfway towards 
achieving the best performance. The normalized scores can be used 
to evaluate relative performance over time and space towards achiev- 
ing the SDGs. For example, if for a particular SDG indicator a province 
lagged behind all other provinces in both 2000 and 2015 but improved 
over time, its score for that SDG indicator in 2015 would be greater than 
its score in 2000, but in both years, its score would be lower than that 
of the other provinces. We normalized the data across provincial and 
national levels together, so that the SDG scores are comparable across 
China and its provinces. 
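
As a worked illustration of the formula and the capping rule, the following sketch assumes that the lower (worst-performance) and upper (best-performance) bounds for an indicator have already been chosen. The function name `normalize` and the example values are hypothetical.

```python
def normalize(x, lower, upper):
    """Rescale x to the 0-100 range between two bounds (illustrative helper).

    `lower` is the worst-performance bound and `upper` the best-performance
    bound, so the same expression also covers indicators for which a smaller
    raw value is better (then lower > upper numerically).
    """
    score = (x - lower) / (upper - lower) * 100
    # Values beyond either bound are capped at 0 or 100.
    return max(0.0, min(100.0, score))
```

For example, with bounds of 0 and 100 a raw value of 50 scores 50, and a raw value of 120 is capped at 100. For a hypothetical poverty-rate indicator with a worst bound of 20 and a best bound of 0, a raw value of 5 scores 75.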


Step 4: calculation of SDG Index scores. We calculated SDG Index 
scores at the national and provincial levels using arithmetic means, 
following the approach used in the 2018 SDG Index and Dashboards 
Report. This is an aggregate score that consists of individual scores
for all 17 SDGs and represents China’s overall performance in achieving
all 17 SDGs over time. All SDGs were weighted equally in the SDG Index
score to convey the importance of integrated solutions that equally
address all 17 SDGs. Consistent with previous research, there is no
a priori reason to give one measure greater weight than another. The
equal weighting is also consistent with the spirit that all countries need
to achieve all 17 SDGs through integrated strategies. Within each SDG,
each indicator is equally weighted, which means that every indicator is
weighted inversely to the number of indicators available for that SDG.
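
The two levels of equal weighting can be illustrated as follows. This is a sketch only: the function name `sdg_index` and the input layout (a mapping from each SDG to its list of normalized indicator scores) are assumptions made for the example.

```python
from statistics import mean


def sdg_index(indicator_scores):
    """Aggregate normalized indicator scores into goal scores and an index.

    `indicator_scores` maps each SDG label to the list of its normalized
    (0-100) indicator scores. Each goal score is the arithmetic mean of its
    indicators, and the index is the arithmetic mean of the goal scores.
    """
    goal_scores = {goal: mean(scores)
                   for goal, scores in indicator_scores.items()}
    return mean(goal_scores.values()), goal_scores
```

With two goals, one having indicator scores of 100 and 50 and the other a single indicator score of 30, the goal scores are 75 and 30 and the index is 52.5. The goal with one indicator counts as much as the goal with two, which is the sense in which indicators are weighted inversely to their number within a goal.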


Step 5: calculation of SDG Index scores and individual SDG score 
over time and between organization levels. At the national level, we 
aggregated China’s 17 SDG scores into one national SDG Index score 
for each year from 2000 to 2015, yielding 16 SDG Index scores. At the 
provincial level, we aggregated each province’s 17 SDG scores for 2000, 
2005, 2010 and 2015, separately, yielding four SDG Index scores per 
province. In addition, we calculated the change in SDG scores separately 
for each of the 17 individual SDG scores and for China and its provinces, 
by subtracting the normalized score in 2000 from the score in 2015. 
The SDGs with the bottom five scores in 2015 were considered to be 
the bottom five SDGs, lagging behind other SDGs. 
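
The change calculation and the selection of the bottom five SDGs can be sketched as below. The function names and the dictionary layout (SDG label mapped to normalized score) are hypothetical choices for illustration.

```python
def score_change(scores_2000, scores_2015):
    """Change in each SDG score: the 2015 score minus the 2000 score."""
    return {goal: scores_2015[goal] - scores_2000[goal]
            for goal in scores_2015}


def bottom_five(scores_2015):
    """Labels of the five SDGs with the lowest 2015 scores, lowest first."""
    return sorted(scores_2015, key=scores_2015.get)[:5]
```

A positive change indicates an increase in the score between 2000 and 2015, and a negative change indicates a decrease.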


Step 6: comparison of SDG Index scores between developing and 
developed regions. Ten developing provinces and ten developed 
provinces in China were selected to compare SDG Index scores between 
relatively more- and less-developed regions, based on each province's 
average GDP per capita from 2000 to 2015. Provinces with the highest
ten GDP values per capita were considered to be developed provinces, 
whereas provinces with the lowest ten GDP values per capita were con- 
sidered to be developing provinces. We also designated provinces 
with the highest five GDP values as the top five developed provinces 
and provinces with the lowest five GDPs as the bottom five develop- 
ing provinces. Finally, we compared the average SDG Index scores, 
calculated across all SDGs, between developed and developing 
provinces. 
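
The grouping and comparison in this step can be sketched as follows, assuming one average GDP per capita value and one SDG Index score per province. The function name `compare_groups` and both input dictionaries are hypothetical.

```python
from statistics import mean


def compare_groups(gdp_per_capita, index_scores, k=10):
    """Mean SDG Index score of the k richest and the k poorest provinces.

    Both arguments map province names to numbers. Provinces are ranked by
    average GDP per capita; the top k are treated as developed and the
    bottom k as developing.
    """
    ranked = sorted(gdp_per_capita, key=gdp_per_capita.get, reverse=True)
    developed, developing = ranked[:k], ranked[-k:]
    return (mean(index_scores[p] for p in developed),
            mean(index_scores[p] for p in developing))
```

Setting `k=5` gives the comparison between the top five developed and the bottom five developing provinces.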


Uncertainty and sensitivity analysis for SDG scores 
To explore the uncertainty introduced by the number of SDG indicators, 
we ran uncertainty analyses. For each SDG, we analysed all possible 
combinations of SDG indicators for all possible numbers of SDG indicators,
which yielded a distribution of SDG scores for China in 2015. This
allowed us to determine the impact of different numbers of indicators 
and different combinations of indicators on the SDG score. We found 
that as the number of indicators increased, the uncertainty (variation) in 
the SDG score decreased. When the number of indicators per SDG was two
or larger, the median SDG score was almost constant (Extended Data 
Fig. 2). We performed an uncertainty analysis for SDG 9 as an example 
using all combinations of SDG indicators, under all possible numbers of 
SDG indicators. Given that the total number of indicators for SDG 9 is 
14, the possible number of indicators to be selected for an uncertainty 
analysis ranges from 1 to 14. The number of possible combinations
of indicators can be calculated based on the theory of combinations.
When we choose m indicators from a total of n indicators, the number
of possible combinations is: 


$$C_{n}^{m} = \frac{n!}{m!\,(n-m)!}$$


For example, when selecting one indicator, there are only 14 possible 
combinations (that is, 1, 2, 3,..., 14). 

When we choose 2 indicators from 14 indicators, the number of pos- 
sible combinations is 


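$$C_{14}^{2} = \frac{1 \times 2 \times \dots \times 12 \times 13 \times 14}{(1 \times 2) \times (1 \times 2 \times \dots \times 10 \times 11 \times 12)} = 91$$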


When selecting 3-13 indicators, the numbers of combinations are 
364, 1,001, 2,002, 3,003, 3,432, 3,003, 2,002, 1,001, 364, 91 and 14,
respectively. When selecting all 14 indicators for analysis, there is only 
one combination. 
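
These counts can be reproduced with the standard binomial coefficient. The short check below uses Python's `math.comb`; the variable names are illustrative only.

```python
from math import comb

# Number of ways to choose m of the 14 SDG 9 indicators, for m = 1..14.
counts = [comb(14, m) for m in range(1, 15)]

# Total number of non-empty indicator subsets for SDG 9.
total = sum(counts)
```

The total is 2^14 − 1 = 16,383, which agrees with the sample size listed for SDG 9 in Extended Data Fig. 2.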

Next we calculated the scores of SDG 9 for all these combinations of 
SDG indicators under different possible numbers of selected indicators.
We obtained the distribution of SDG 9 scores for China in 2015 to
determine the effect of the number of indicators under all potential 
combinations of indicators on the SDG score. We found that as the 
number of indicators for SDG 9 increased, the uncertainty (variation) 
decreased. When the number of indicators for SDG 9 was two or larger, 
the median SDG score remained almost constant (Extended Data Fig. 2). 
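
The enumeration can be sketched as follows, assuming that an SDG score is the arithmetic mean of the selected normalized indicator scores (as in Step 4). The function name `score_distribution` and the toy inputs are hypothetical; they are not the SDG 9 data.

```python
from itertools import combinations
from statistics import mean


def score_distribution(indicator_scores):
    """Map each subset size m to the scores of all m-indicator subsets.

    `indicator_scores` is the list of normalized scores for one SDG.
    Every subset of size m contributes one SDG score, computed as the
    mean of the indicators in that subset.
    """
    n = len(indicator_scores)
    return {m: [mean(subset) for subset in combinations(indicator_scores, m)]
            for m in range(1, n + 1)}
```

With three toy indicator scores of 0, 50 and 100, the single-indicator scores span 0 to 100, the two-indicator scores span 25 to 75, and the single three-indicator score is 50, which mirrors the narrowing of the distribution as more indicators are included.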

We also ran a sensitivity analysis to assess the sensitivity of the SDG
scores to different values of variables that affect the SDG scores. We
employed a widely used sensitivity index to measure the degree of
sensitivity: $S_x = (\Delta X/X)/(\Delta P/P)$, where X is the SDG score under the
original condition for a performer of interest, $\Delta X$ is the difference of
the SDG score for the performer of interest (for example, one province
in a specific year) between the original and modified conditions due
to changes in the performer’s data value of a certain SDG indicator. P
represents the value of an SDG indicator of the performer of interest
under the original condition and $\Delta P$ is the difference in the data value of
the SDG indicator of the performer between the original and modified
conditions. $S_x$ refers to the change in the SDG score of the performer
due to the change in the data value of the SDG indicator. We decreased
and increased (separately) the value for each indicator by 10% for China 
at the national level as well as for three randomly chosen provinces 
(Beijing, Henan and Gansu) from provinces at three sustainable devel- 
opment levels (average SDG Index scores in years 2000, 2005, 2010 and 
2015: 1st to 10th highest as high level, 11th to 20th as middle level, 21st
to 31st as low level) as examples and recalculated their SDG score and
obtained the sensitivity index $S_x$. We found that the sensitivity of SDG
scores to changes in an indicator’s data value is very small (less than 
0.2) (Extended Data Fig. 3). 
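
The sensitivity index can be computed directly from the original and modified values. In the sketch below the function name and argument names are hypothetical; `x_orig` and `x_mod` stand for the SDG score before and after the change, and `p_orig` and `p_mod` for the indicator value before and after the change.

```python
def sensitivity_index(x_orig, x_mod, p_orig, p_mod):
    """Relative change in the SDG score per relative change in an indicator."""
    relative_score_change = (x_mod - x_orig) / x_orig
    relative_input_change = (p_mod - p_orig) / p_orig
    return relative_score_change / relative_input_change
```

As a made-up example, if a 10% increase in an indicator (from 50 to 55) moves an SDG score from 60 to 60.6, the score changes by 1% and the index is 0.1.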

To assess where China stands relative to the rest of the world, we 
recalculated China’s SDG Index score using the indicators that over- 
lapped between our paper and the 2018 SDG Index and Dashboards 
report. China’s SDG Index score over time relative to the rest of the world
in one year is shown (Extended Data Fig. 4). 

To examine the spatio-temporal heterogeneity of SDGs at the pro- 
vincial level, we calculated the coefficient of variation for each SDG 
score across provinces over time (Extended Data Fig. 5). 
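
A coefficient of variation is the standard deviation divided by the mean. The sketch below assumes the population standard deviation, which is an assumption of this example because the text does not state which form was used; the function name is hypothetical.

```python
from statistics import mean, pstdev


def coefficient_of_variation(scores):
    """Standard deviation of the provincial scores divided by their mean.

    Uses the population standard deviation (an assumption of this sketch).
    """
    return pstdev(scores) / mean(scores)
```

Applied to the scores of all provinces for one SDG in one year, a larger value indicates greater disparity among provinces for that SDG.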


Reporting summary 
Further information on research design is available in the Nature 
Research Reporting Summary linked to this paper. 


Data availability 


All data are available from the corresponding authors upon reasonable 
request. Data that support the findings of this study are available within 
the paper and its Supplementary Information. 
Extended Data Fig. 1 | Change in China’s individual SDG scores at the national level from 2000 to 2015. For data sources, see Methods.
Extended Data Fig. 2 | Uncertainty analysis for SDG scores (n = 281,287) at
the national level in 2015 for different numbers of selected indicators. 1-17
indicates uncertainty analysis for SDG 1-17. Sample sizes are 63, 1,023, 262,143,
1,023, 63, 63, 7, 15, 16,383, 15, 63, 127, 31, 7, 127, 7 and 127 for box plots of SDG
1-17. In each box plot, the central rectangle spans the first quartile Q1 to the
third quartile Q3, which is the interquartile range (IQR = Q3 − Q1),
while the line segment inside the rectangle shows the median. When the
maximum observed SDG scores are greater than Q3 + 1.5 × IQR, the upper
whisker (red) is Q3 + 1.5 × IQR. Otherwise, the upper whisker is the maximum
observed SDG score. When the minimum observed SDG scores are less than
Q1 − 1.5 × IQR, the lower whisker (green) is Q1 − 1.5 × IQR. Otherwise, the
lower whisker is the minimum observed SDG score.
Extended Data Fig. 3 | Sensitivity of SDG scores to changes in each indicator.
The sensitivity index $S_x$ of SDG scores is shown when each SDG indicator’s
original data value decreased by 10%, (1)-(16), or increased by 10%, (17)-(32),
for China and for three example provinces (Beijing, Henan and Gansu) at three
sustainable development levels in 2000, 2005, 2010 and 2015. The sample size n
for each figure is 119 indicators. The x axes display the SDG indicators
arranged from 1 to 119. The y axis is the sensitivity index $S_x$ of SDG scores
due to the 10% decrease or increase in the original value of each indicator.
Extended Data Fig. 5 | Coefficient of variation for SDG scores. a, Coefficient of variation (CV) for SDG scores of provinces in 2000, 2005, 2010 and 2015.
b, Average value of the coefficient of variation for SDG scores at the provincial level in 2000, 2005, 2010 and 2015. 
Water lilies belong to the angiosperm order Nymphaeales. Amborellales,
Nymphaeales and Austrobaileyales together form the so-called ANA-grade of
angiosperms, which are extant representatives of lineages that diverged the earliest
from the lineage leading to the extant mesangiosperms. Here we report the
409-megabase genome sequence of the blue-petal water lily (Nymphaea colorata).
Our phylogenomic analyses support Amborellales and Nymphaeales as successive
sister lineages to all other extant angiosperms. The N. colorata genome and 19 other
water lily transcriptomes reveal a Nymphaealean whole-genome duplication event,
which is shared by Nymphaeaceae and possibly Cabombaceae. Among the genes
retained from this whole-genome duplication are homologues of genes that regulate
flowering transition and flower development. The broad expression of homologues of
floral ABCE genes in N. colorata might support a similarly broadly active ancestral
ABCE model of floral organ determination in early angiosperms. Water lilies have
evolved attractive floral scents and colours, which are features shared with
mesangiosperms, and we identified their putative biosynthetic genes in N. colorata.
The chemical compounds and biosynthetic genes behind floral scents suggest that
they have evolved in parallel to those in mesangiosperms. Because of its unique
phylogenetic position, the N. colorata genome sheds light on the early evolution of
angiosperms.


Many water lily species, particularly from Nymphaea (Nymphaeaceae), 
have large and showy flowers and belong to the angiosperms (also 
called flowering plants). Their aesthetic beauty has captivated nota- 
ble artists such as the French impressionist Claude Monet. Water lily 
flowers have limited differentiation in perianths (outer floral organs), 
but they possess both male and female organs and have diverse scents 
and colours, similar to many mesangiosperms (core angiosperms, 
including eudicots, monocots, and magnoliids) (Supplementary 
Note 1). In addition, some water lilies have short life cycles and enormous
numbers of seeds, which increase their potential as a model plant to
represent the ANA-grade of angiosperms and to study early evolutionary
events within the angiosperms. In particular, N. colorata Peter has a 
relatively small genome size (2n = 28 and approximately 400 Mb) and 
blue petals that make it popular in breeding programs (Supplementary 
Note 1). 

We report here the genome sequence of N. colorata, obtained using 
PacBio RSII single-molecule real-time (SMRT) sequencing technology.
The genome was assembled into 1,429 contigs (with a contig N50
of 2.1 Mb) and total length of 409 Mb with 804 scaffolds, 770 of which
were anchored onto 14 pseudo-chromosomes (Extended Data Fig. 1
and Extended Data Table 1). Genome completeness was estimated to 
be 94.4% (Supplementary Note 2). We annotated 31,580 protein-coding 
genes and predicted repetitive elements with a collective length of 
160.4 Mb, accounting for 39.2% of the genome (Supplementary Note 3). 

The N. colorata genome provides an opportunity to resolve the 
relationships between Amborellales, Nymphaeales and all other extant 
angiosperms (Fig. 1a). Using six eudicots, six monocots, N. colorata and 
Amborella, and each of three gymnosperm species (Ginkgo biloba,
Picea abies and Pinus taeda) as an outgroup in turn, we identified 2,169, 
1,535 and 1,515 orthologous low-copy nuclear (LCN) genes, respec- 
tively (Fig. 1b). Among the LCN gene trees inferred from nucleotide 
sequences using G. biloba as an outgroup, 62% (294 out of 475 trees) 
place Amborella as the sister lineage to all other extant angiosperms 
with bootstrap support greater than 80% (type II, Fig. 1c). Using P. abies
or P. taeda as the outgroup, Amborella is placed as the sister lineage 
to the remaining angiosperms in 57% and 54% of the LCN gene trees, 
respectively. LCN gene trees inferred using amino acid sequences show 
similar phylogenetic patterns (Supplementary Note 4.1). 


A list of affiliations appears at the end of the paper.
Fig. 1 | Phylogenomic relationships of angiosperms. a, Three different
evolutionary relationships among major clades of angiosperms. b, Number of
LCN gene trees with different bootstrap support (BS) values based on
nucleotide sequences from six eudicots, six monocots, N. colorata, Amborella
and three different gymnosperms. c, Comparison of gene trees supporting the
three evolutionary relationships using each gymnosperm in turn as the
[…] Nymphaeales used in d.

Fig. 2 | A Nymphaealean WGD shared by Nymphaeaceae and possibly
Cabombaceae. a, Ks age distributions for paralogues found in collinear regions
(anchor pairs) of N. colorata and for orthologues between N. colorata and
selected Nymphaealean and angiosperm species. Red and yellow arrows
indicate under- and overestimations of the N. colorata-Nuphar advena and
N. colorata-C. caroliniana divergence, respectively. b, WGD phylogenomic
analysis. Numbers in parentheses are the number of gene families with retained
C. caroliniana duplicates supporting the duplication events. Numbers below
branches show branch lengths in Ks units. The double-arrowed line denotes
total Ks from the pointed node to N. colorata. We used G. biloba (dashed branch)
as an outgroup. The red dot denotes the branch on which most of the anchor

To minimize the potential shortcomings of sparse taxon sampling,
we also inferred an angiosperm species tree using sequences from
44 genomes and 71 transcriptomes, including representatives of the
ANA-grade, eudicots, magnoliids, monocots and a gymnosperm outgroup
(Gnetum montanum, G. biloba, P. abies and P. taeda) (Methods).
For further phylogenetic inference of these 115 species, we selected,
based on various criteria, five different LCN gene sets including 1,167,
834, 683, 602 and 445 genes. Analyses of these five datasets all yielded
similar tree topologies with Amborella and Nymphaeales as successive
sister lineages to all other extant angiosperms (Fig. 1d, e, Supplementary
Note 4.2).

Molecular dating of angiosperm lineages, using a stringent set of 101
LCN genes and with age calibrations based on 21 fossils, inferred the
crown age of angiosperms at 234-263 million years ago (Ma) (Fig. 1d). 
The split between monocots and eudicots was estimated at 171-203 Ma 
and that between Nymphaeaceae and Cabombaceae at 147-185 Ma. 

Genomic collinearity unveiled evidence of a whole-genome duplication
(WGD) event in N. colorata (Extended Data Figs. 1f, 2a and Supplementary
Note 5.1). The number of synonymous substitutions per
synonymous site (Ks) distributions for N. colorata paralogues further
showed a signature peak at Ks of approximately 0.9 (Fig. 2a) and peaks at
similar Ks values were identified in other Nymphaeaceae species (Supplementary
Note 5.2), which suggests an ancient single WGD event that
is probably shared among Nymphaeaceae members. Comparison of
the N. colorata paralogue Ks distribution with Ks distributions of orthologues
(representing speciation events) between N. colorata and other
Nymphaeales lineages, Illicium henryi, and Amborella suggests that
the WGD occurred just after the divergence between Nymphaeaceae
and Cabombaceae (Fig. 2a). By contrast, phylogenomic analyses of
gene families that contained at least one paralogue pair from collinear
regions of N. colorata suggest that the WGD is shared between Nymphaeaceae
and Cabombaceae (Fig. 2b, Supplementary Note 5.4). If true,
Cabomba caroliniana seems to have retained few duplicates (Fig. 2b,
c), which would explain the absence of a clear peak in the C. caroliniana
paralogue Ks distribution (Supplementary Note 5.2). Absolute dating
of the paralogues of N. colorata does suggest that the WGD could have
occurred before or close to the divergence between Nymphaeaceae
and Cabombaceae (Extended Data Fig. 2d, Supplementary Note 5.3), 
considering the variable substitution rates among Nymphaealean line- 
ages (Fig. 2a, b, Extended Data Fig. 2c). An alternative interpretation 
of the above results could be that the WGD signatures were from an 
allopolyploidy event that occurred between ancestral Nymphaeaceae 
and Cabombaceae lineages shortly after their divergence and that 
gave rise to the Nymphaeaceae (but not Cabombaceae) stem lineage 
(Fig. 2d, Supplementary Note 5.4). 

The water lily lineage descended from one of the early divergences 
among angiosperms, before the radiation of mesangiosperms. Thus, 
this group offers a unique window into the early evolution of angio- 
sperms, particularly that of the flower. We identified 70 MADS-box 
genes, including homologues of the genes for the ABCE model of floral 
organ identities: AP1 (and also FUL) and AGL6 (A function for sepals
and petals), AP3 and PI (B function for petals and stamen), AG (C func- 
tion for stamen and carpel), and SEP1 (E function for interacting with 
ABC function proteins). Phylogenetic and collinearity analyses of the 
MADS-box genes and their genomic neighbourhood indicate that an 
ancient tandem duplication before the divergence of seed plants gave 
birth to the ancestors of A function (FUL) and E function genes (SEP) 
(Extended Data Fig. 3, Supplementary Note 6.1). Also, owing to the
Nymphaealean WGD, N. colorata has two paralogues, AGa and AGb, of the
C-function gene AG (Extended Data Fig. 4). Similarly, the Nymphaealean
WGD-derived duplicates are homologous to other genes associated
with development of carpel and stamen, and to genes that regulate
flowering time and auxin-controlled circadian opening and closure of
the flower (Extended Data Figs. 4-6, Supplementary Note 6.2-6.4).

The expression profiles of N. colorata ABCE homologues largely agree 
with their putative ascribed roles in floral organ patterning (Fig. 3a). 
Fig. 3 | MADS-box genes in N. colorata and proposed floral ABCE model in
early angiosperms. a, Gene expression patterns of MIKCc from various organs
of N. colorata. Three clusters of genes were classified according to the 
expression of type II MADS-box genes. The organ types (vegetative organs and 
floral organs) were matched to the expression patterns of type II MADS-box 


Notably, the N. colorata AGL6 homologue is mainly expressed in sepals 
and petals, whereas the FUL homologue is mainly expressed in carpels, 
suggesting that AGL6 acts as an A-function gene in N. colorata. The two
C-function homologues AGa and AGb are highly expressed in stamens
and carpels, respectively, whereas AGb is also expressed in sepals and
petals, suggesting that they might have undergone subfunctionaliza- 
tion and possibly neofunctionalization for flower development after 
the Nymphaealean WGD. Furthermore, the ABCE homologues in N. 
colorata generally exhibit wider ranges of expression in floral organs 
than their counterparts in eudicot model systems (Fig. 3b). This wider 
expression pattern, in combination with broader expression of at least 
some ABCE genes in some eudicots representing an early-diverging 
lineage, some monocots and magnoliids, suggests an ancient ABCE
model for flower development, with subsequent canalization of gene
expression and function regulated by the more specialized ABCE genes
during the evolution of mesangiosperms, especially core eudicots.
This could also account for the limited differentiation between sepals
and petals in Nymphaeales species, and is consistent with a single type
of perianth organ proposed in an ancestral angiosperm flower.

Floral scent serves as an olfactory cue for insect pollinators. Whereas
Amborella flowers are scentless, N. colorata flowers release 11 different
volatile compounds, including terpenoids (sesquiterpenes), fatty- 
acid derivatives (methyl decanoate) and benzenoids (Fig. 4a). The N. 
colorata genome contains 92 putative terpene synthase (TPS) genes,
which are ascribed to four previously recognized TPS subfamilies in
angiosperms: TPS-b, TPS-c, TPS-e/f and TPS-g (Fig. 4b), but none was
found for TPS-a, which is responsible for sesquiterpene biosynthesis
in mesangiosperms. Notably, TPS-b contains more than 80 genes in
N. colorata; NC11G0123420 is highly expressed in flowers (Extended 
Data Fig. 7); this result suggests that it may be a candidate gene for 
sesquiterpene biosynthesis in N. colorata. Also, methyl decanoate has
not been detected as a volatile compound in monocots and eudicots
and is thought to be synthesized in N. colorata by the SABATH family
of methyltransferases. The N. colorata genome contains 13 SABATH
homologues and 12 of them form a Nymphaeales-specific group (Supplementary
Fig. 41). Among these 12 members, NC11G0120830 showed
the highest expression in petals (Fig. 4c) and its corresponding recombinant
protein was demonstrated to be a fatty acid methyltransferase
that had the highest activity with decanoic acid as the substrate (Fig. 4d, 
Supplementary Note 7.1). These results suggest that the floral scent 
biosynthesis in N. colorata has been accomplished through enzymatic 
functions that have evolved independently from those in mesangio- 
sperms (Fig. 4e). 

Nymphaea colorata is valued for the aesthetically attractive blue 
colour of petals, which is a rare trait in ornamentals. To understand the
molecular basis of the blue colour, we identified delphinidin
3′-O-(2″-O-galloyl-6″-O-acetyl-β-galactopyranoside) as the main blue
anthocyanidin pigment (Extended Data Fig. 8a-c). By comparing the expression
profiles between two N. colorata cultivars with white and blue petals
for genes in a reconstructed anthocyanidin biosynthesis pathway, we
found genes for an anthocyanidin synthase and a delphinidin-modi- 
fication enzyme, the expression of which was significantly higher in 
blue petals than in white petals (Extended Data Fig. 8d, e). These two 
enzymes catalyse the last two steps of anthocyanidin biosynthesis and 
are therefore key enzymes specialized in blue pigment biosynthesis
(Supplementary Note 7.2). 

Water lilies have a global distribution that includes cold regions 
(northern China and northern Canada), unlike the other ANA-grade 
angiosperms Amborella (Pacific Islands) and Austrobaileyales (tem- 
perate and tropical regions). We detected marked expansions of genes 
Fig. 4 | Floral scent and biosynthesis in N. colorata. a, Gas chromatogram of
floral volatiles from the flower of N. colorata. The internal standard (IS) is nonyl
acetate. Methyl esters are in blue; terpenes are in red. Floral scent was
measured three times independently with similar results. b, Phylogenetic tree
of terpene synthases from N. colorata and representative plants showing the
subfamilies from a-h and x. c, Expression analysis of SABATH genes of
N. colorata showed that NC11G0120830 had the highest expression level in petal.


related to immunity and stress responses in N. colorata, including 
genes encoding nucleotide-binding leucine-rich repeat (NLR) proteins, 
protein kinases and WRKY transcription factors, compared with those 
in Amborella and some mesangiosperms (Extended Data Fig. 9, Sup- 
plementary Note 8). Itis possible that increased numbers of these genes 
enabled water lilies to adapt to various ecological habitats globally. 

In conclusion, the N. colorata genome offers a reference for comparative
genomics and for resolving the deep phylogenetic relationships
among the ANA-grade and mesangiosperms. It has also revealed a WGD 
specific to Nymphaeales, and provides insights into the early evolution 
of angiosperms on key innovations such as flower development and 
floral scent and colour. 

Methods


Genome and transcriptome sequencing 
Total DNA for genome sequencing was extracted from young leaves. 
Leaf RNA was extracted from 18 water lily species: N. colorata, Eury- 
ale ferox, Brasenia schreberi, Victoria cruziana, Nymphaea mexicana, 
Nymphaea prolifera, Nymphaea tetragona, Nymphaea potamophila, 
Nymphaea caerulea, Nymphaea rubra, Nymphaea ‘midnight’, Nym- 
phaea ‘Choolarp’, Nymphaea ‘Paramee’, Nymphaea ‘Woods blue god- 
dess’, Nymphaea gigantea ‘Albert de Lestang’, N. gigantea ‘Hybrid I’, 
Nymphaea ‘Thong Garnjana’ and Nuphar lutea. In addition, for tran- 
scriptome sequencing we sampled several organs and tissues from 
N. colorata including mature leaf, mature leafstalk, juvenile flower, 
juvenile leaf, juvenile leafstalk, carpel, stamen, sepal, petal and root. 
For PacBio sequencing, we prepared approximately 20-kb SMRTbell 
libraries. A total of 34 SMRT cells and 49.8 Gb data composed of 5.5 
million reads were sequenced on PacBio RSII system with P6-C4 chem- 
istry. All transcriptome libraries were sequenced using the Illumina 
platform, generating paired-end reads. For the Hi-C sequencing and 
scaffolding, a Hi-C library was created from tender leaves of N. colorata. 
In brief, the leaves were fixed with formaldehyde and lysed, and the 
cross-linked DNA was then digested with MboI overnight. Sticky ends
were biotinylated and proximity-ligated to form chimeric junctions, 
which were physically sheared and enriched for sizes of 500-700
bp. Chimeric fragments representing the original cross-linked long-distance
physical interactions were then processed into paired-end
sequencing libraries, and 346 million 150-bp paired-end reads were
sequenced on the Illumina platform.


Sequence assembly and gene annotation 

To assemble the 49.8 Gb data composed of 5.5 million reads, we filtered 
the reads to remove organellar DNA, reads of poor quality or short 
length, and chimaeras. The contig-level assembly was performed on 
full PacBio long reads using the Canu package. Canu v.1.3 was used
for self-correction and assembly. We then polished the draft assem- 
bly using Arrow (https://github.com/PacificBiosciences/Genomic- 
Consensus). To increase the accuracy of the assembly, Illumina short 
reads were recruited for further polishing with the Pilon program 
(https://github.com/broadinstitute/pilon). The genome assembly 
quality was measured using BUSCO (Benchmarking Universal Single-Copy
Orthologues) v.3.0. The paired-end reads from Hi-C were
uniquely mapped onto the draft assembly contigs, which were grouped 
into chromosomes and scaffolded using the software Lachesis (https:// 
github.com/shendurelab/LACHESIS). 

Genscan (http://genes.mit.edu/GENSCAN.html) and Augustus were
used to carry out de novo predictions with gene model parameters
trained from Arabidopsis thaliana. Furthermore, gene models were
de novo predicted using MAKER. We then evaluated the genes by
comparing MAKER results with the corresponding transcript evidence
to select gene models that were the most consistent on the basis of an
AED metric. 


The evolutionary position of water lily and divergence-time 
estimation 

LCN genes were identified based on OrthoFinder results. The
orthologues were obtained from six monocots (Spirodela polyrhiza, 
Zostera marina, Musa acuminata, Ananas comosus, Sorghum bicolor 
and Oryza sativa) and six eudicots (Nelumbo nucifera, Vitis vinifera, 
Populus trichocarpa, A. thaliana, Solanum lycopersicum and Beta vulgaris),
N. colorata, Amborella, and the gymnosperms G. biloba, P. abies
and P. taeda. LCN genes needed to meet the following requirements: 
strictly single-copy in N. colorata, Amborella, G. biloba, P. abies or 
P. taeda, and single-copy in at least five of the 12 eudicots or monocots. 
With G. biloba, P. abies or P. taeda as the outgroup, we identified 2,169, 
1,535 and 1,515 orthologous LCN genes, respectively. Furthermore, we
trimmed the sites with less than 90% coverage. LCN gene trees were
estimated from the remaining sites using RAxML v.7.7.8 with the
GTR+G+I model for nucleotide sequences (Fig. 1c) and the JTT+G+I
model for amino acid sequences (Supplementary Note 4.1). To account 
for incomplete lineage sorting and different substitution rates, we 
applied the multispecies coalescent model and a supermatrix method, 
respectively, to the LCN genes and found further support for the sister 
relationship between Amborella and all other extant flowering plants 
(Supplementary Note 4.2). 
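
The LCN selection criteria described in this subsection can be illustrated with a minimal sketch. It assumes that each orthogroup has already been summarized as a mapping from species label to gene copy number; the species labels, the function name is_lcn and the reading of the criteria (exactly one copy in N. colorata, Amborella and the chosen gymnosperm outgroup, and one copy in at least five of the 12 monocots and eudicots) are illustrative assumptions, not the authors' code.

```python
from typing import Dict, List

# Species labels are illustrative placeholders, not identifiers from the study.
CORE = ["N_colorata", "Amborella", "G_biloba"]      # focal taxa plus one gymnosperm outgroup
MESANGIOSPERMS = [f"sp{i}" for i in range(1, 13)]   # stands in for the 6 monocots + 6 eudicots


def is_lcn(copies: Dict[str, int],
           core: List[str] = CORE,
           others: List[str] = MESANGIOSPERMS,
           min_single: int = 5) -> bool:
    """Return True if an orthogroup passes the low-copy filter sketched here."""
    # Requirement 1: exactly one copy in every core species.
    if any(copies.get(sp, 0) != 1 for sp in core):
        return False
    # Requirement 2: single-copy in at least `min_single` of the other species.
    n_single = sum(1 for sp in others if copies.get(sp, 0) == 1)
    return n_single >= min_single
```

Swapping the third entry of CORE for a P. abies or P. taeda label would correspond to the alternative outgroups mentioned in the text.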

We further carefully selected five LCN gene sets (1,167, 834, 683, 
602 and 445) from 115 species and applied both a supermatrix 
method and the multi-species coalescent model to infer the phylogeny
of angiosperms (Supplementary Note 4.2). The phylogeny
inferred from 1,167 LCN genes is shown in Fig. 1d, with different sup- 
port values from the multi-species coalescent analyses of the other 
four LCN gene sets. 

To estimate the evolutionary timescale of angiosperms, we calibrated
a relaxed molecular clock using 21 fossil-based age constraints
throughout the tree, including the earliest fossil tricolpate pollen
(approximately 125 Ma) associated with eudicots. We concatenated
101 selected genes (205,185 sites) and fixed the tree topology to that 
inferred from our coalescent-based analysis of 1,167 genes from 115 
taxa. We performed a Bayesian phylogenomic dating analysis of the 
101 selected genes in MCMCtree, part of the PAML package, and
used approximate likelihood calculation for the branch lengths.
Molecular dating was performed using an auto-correlated model of 
among-lineage rate variation, the GTR substitution model, and a uni- 
form prior on the relative node times. Posterior distributions of node 
ages were estimated using Markov chain Monte Carlo sampling, with 
samples drawn every 250 steps over 10 million steps following a burn-in 
of 500,000 steps. We checked for convergence by running the analysis 
in duplicate and checked for sufficient sampling. 

We also implemented the penalized likelihood method under a variable
substitution rate using TreePL and r8s, as a constant substitution
rate across the phylogenetic tree was rejected (P < 0.01) for all
cases by likelihood-ratio tests in PAUP”. Three fossil calibrations, corre- 
sponding to the crown groups of Lamiales, Cornales and Laurales, were 
implemented as minimum age constraints in our penalized likelihood 
dating analysis, except that the earliest appearance of tricolpate pollen 
grains (about 125 Ma)” was used to fix the age of crown eudicots. We 
determined the best smoothing parameter value of the concatenated 
101 LCN genes as 0.32 by performing cross-validations of a range of 
smoothing parameters from 0.01 to 10,000 (algorithm = TN; crossv = yes;
cvstart = -2; cvinc = 0.5; cvnum = 15). We used 100 bootstrap trees with
branch lengths generated by RAxML to infer the 95% confidence intervals
of age estimates (Supplementary Note 4.2).


Identification of WGD 

The N. colorata genome was compared with each of the other genomes 
by pairwise alignment using Large-Scale Genome Alignment Tool (LAST; 
http://last.cbrc.jp/). We defined syntenic blocks using LAST hits with a
distance cut-off of 20 genes apart from the two retained homologous
pairs, in which at least four consecutive retained homologous pairs
were required. We then obtained the one-to-one blocks to exclude
ancient duplication blocks with QUOTA-ALIGN.
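
As a rough illustration of the block definition above (a distance cut-off of 20 genes and at least four consecutive homologous pairs), the following sketch chains anchors greedily along one pair of chromosomes. It is a deliberately simplified stand-in, not the procedure used in the study: anchors are assumed to be given as pairs of gene indices, and the function name chain_blocks and its parameters are hypothetical.

```python
from typing import List, Tuple

Anchor = Tuple[int, int]  # (gene index in genome A, gene index in genome B)


def chain_blocks(anchors: List[Anchor],
                 max_gap: int = 20,
                 min_pairs: int = 4) -> List[List[Anchor]]:
    """Greedily group anchors into blocks; a simplified, single-pass sketch."""
    blocks: List[List[Anchor]] = []
    current: List[Anchor] = []
    for anchor in sorted(anchors):
        if current:
            prev = current[-1]
            # An anchor extends the block only if it is within max_gap genes
            # of the previous anchor in BOTH genomes.
            close = (abs(anchor[0] - prev[0]) <= max_gap
                     and abs(anchor[1] - prev[1]) <= max_gap)
            if not close:
                if len(current) >= min_pairs:
                    blocks.append(current)
                current = []
        current.append(anchor)
    # Flush the last open block if it is long enough.
    if len(current) >= min_pairs:
        blocks.append(current)
    return blocks
```

Dedicated synteny tools solve this as a two-dimensional chaining problem and handle inversions and overlapping chains, which this sketch does not attempt.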

Ks-based paralogue age distributions were constructed as previously
described. In brief, the paranome was constructed by performing an
all-against-all protein sequence similarity search using BLASTP with
an E-value cut-off of 10”, after which gene families were built with the
mclblastline pipeline (v.10-201) (micans.org/mcl). Each gene family
was aligned using MUSCLE (v.3.8.31), and Ks estimates for all pairwise
comparisons within a gene family were obtained using maximum likelihood
in the CODEML program of the PAML package (v.4.4c). We
then subdivided gene families into subfamilies for which Ks estimates
between members did not exceed a value of 5.

To correct for the redundancy of Ks values (a gene family of n members
produces n(n - 1)/2 pairwise Ks estimates for n - 1 retained duplication
events), we inferred a phylogenetic tree for each subfamily using
PhyML with the default settings. For each duplication node in the
resulting phylogenetic tree, all m Ks estimates between the two child
clades were added to the Ks distribution with a weight of 1/m (in which
m is the number of Ks estimates for a duplication event), so that the
weights of all Ks estimates for a single duplication event summed to
one. Paralogous gene pairs found in duplicated collinear segments
(anchor pairs) from N. colorata were detected using i-ADHoRe (v.3.0)
with ‘level_2_only = TRUE’. The identified anchor pairs are assumed
to correspond to the most recent WGD event.
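
The node-weighting step can be made concrete with a small sketch. It assumes that the pairwise estimates have already been grouped by the duplication node that separates each gene pair; the node labels, the example values and the function names are hypothetical.

```python
from typing import Dict, List, Tuple


def pairwise_count(n: int) -> int:
    """Number of pairwise comparisons in a gene family of n members."""
    return n * (n - 1) // 2


def node_weighted_ks(dup_nodes: Dict[str, List[float]]) -> List[Tuple[float, float]]:
    """Return (estimate, weight) pairs, with weight 1/m for each duplication node."""
    weighted: List[Tuple[float, float]] = []
    for _node, estimates in dup_nodes.items():
        m = len(estimates)
        if m == 0:
            continue  # a node without estimates contributes nothing
        for value in estimates:
            # Each of the m estimates across this node gets weight 1/m,
            # so the node as a whole contributes a total weight of one.
            weighted.append((value, 1.0 / m))
    return weighted
```

For a four-gene family with topology ((A,B),(C,D)), there are six pairwise estimates but only three duplication nodes: one estimate across the A/B node, one across the C/D node and four across the root. After weighting, the six estimates carry a total weight of three, one per duplication event.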

The Ks-based orthologue age distributions were constructed by identifying
one-to-one orthologues between species using InParanoid
with default settings, followed by Ks estimation using the CODEML
program as above. Ks distributions for one-to-one orthologues between
N. colorata and each of V. cruziana, N. advena, C. caroliniana, I. henryi
and Amborella were used to compare the relative timing of the WGD in
N. colorata with speciation events within Nymphaeales. Ks distributions
for one-to-one orthologues between the outgroup species I. henryi
and each of N. lutea, N. advena, N. mexicana, Nymphaea ‘Woods blue
goddess’, N. colorata, and C. caroliniana were used to estimate and compare
relative substitution rates among these Nymphaealean species.
Additional comparisons using V. vinifera and Amborella as outgroup
species instead of I. henryi gave similar results (data not shown).

Absolute dating of the identified WGD event in N. colorata was performed
as previously described. Briefly, paralogous gene pairs located
in duplicated segments (anchor pairs) and duplicated pairs lying under
the WGD peak (peak-based duplicates) were collected for phylogenetic
dating. We selected anchor pairs and peak-based duplicates present
under the N. colorata WGD peak and with Ks values between 0.7 and 1.2
(grey-shaded area in Extended Data Fig. 2b) for absolute dating. For
each WGD paralogous pair, an orthogroup was created that included
the two paralogues plus several orthologues from other plant species
as identified by InParanoid using a broad taxonomic sampling:
one representative orthologue from the order Cucurbitales, two from
Rosales, two from Fabales, two from Malpighiales, two from Brassicales,
one from Malvales, one from Solanales, two from Poaceae (Poales), one
from A. comosus (Bromeliaceae, Poales), one from either M. acuminata
(Zingiberales) or Phoenix dactylifera (Arecales), one from the
Asparagales (from Asparagus officinalis, Apostasia shenzhenica, or
Phalaenopsis equestris), one from the Alismatales (either from S. polyrhiza
or Z. marina), one from Amborella, and one from G. biloba.
In total, 217 orthogroups based on anchor pairs and 142 orthogroups
based on peak-based duplicates were collected.

The node joining the two WGD paralogues of N. colorata was then
dated using the BEAST v1.7 package under an uncorrelated relaxed-clock
model and an LG+G model with four site-rate categories. A starting
tree with branch lengths satisfying all fossil prior constraints was created
according to the consensus APG IV phylogeny. Fossil calibrations
were implemented using log-normal calibration priors on the following
nodes: the node uniting the Malvidae based on the fossil Dressiantha
bicarpellata with prior offset = 82.8, mean = 3.8528, and s.d. = 0.5;
the node uniting the Fabidae based on the fossil Paleoclusia chevalieri
with prior offset = 82.8, mean = 3.9314, and s.d. = 0.5; the node uniting
the non-Alismatalean monocots based on fossil Liliacidites with
prior offset = 93.0, mean = 3.5458, and s.d. = 0.5; the node uniting the
N. colorata WGD paralogues with the eudicots and monocots based on
the sudden abundant appearance of eudicot tricolpate pollen in the
fossil record with prior offset = 124, mean = 4.8143 and s.d. = 0.5; and
the root uniting the above clades with Amborella and then G. biloba
with prior offset = 307, mean = 3.8876, and s.d. = 0.5. The offsets of
these calibrations represent hard minimum boundaries, and their
means represent locations for their respective peak mass probabilities
in accordance with previous dating studies of these specific
clades (see Supplementary Note 5.3 for an alternative setting of
orthogroups).
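
To see where each calibration prior places its peak, the listed parameters can be converted to an age. The sketch below assumes that the mean and s.d. are the log-space parameters of an offset log-normal distribution, in which case the mode lies at offset + exp(mean - s.d.^2); this parameterization is an assumption made for illustration, and the dictionary labels and function name are hypothetical.

```python
import math


def lognormal_mode(offset: float, mean: float, sd: float) -> float:
    """Age (Ma) at which an offset log-normal prior has its highest density."""
    # For log(X) ~ Normal(mean, sd), the mode of X is exp(mean - sd**2);
    # the offset shifts the whole distribution to the hard minimum age.
    return offset + math.exp(mean - sd ** 2)


# (offset, mean, s.d.) triples as listed in the text; the keys are illustrative labels.
CALIBRATIONS = {
    "Malvidae": (82.8, 3.8528, 0.5),
    "Fabidae": (82.8, 3.9314, 0.5),
    "non-Alismatalean monocots": (93.0, 3.5458, 0.5),
    "WGD paralogues + eudicots + monocots": (124.0, 4.8143, 0.5),
    "root": (307.0, 3.8876, 0.5),
}

if __name__ == "__main__":
    for label, (offset, mean, sd) in CALIBRATIONS.items():
        mode = lognormal_mode(offset, mean, sd)
        print(f"{label}: prior mode at about {mode:.1f} Ma")
```

Under this reading, the specified prior for the node uniting the WGD paralogues with the eudicots and monocots peaks close to 220 Ma, and the root prior peaks close to 345 Ma.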

A run without data was performed to ensure proper placements of
the marginal calibration priors, which do not necessarily correspond
to the calibration priors specified above, because they interact with
each other and the tree prior. Indeed, a run without data indicated
that the distribution of the marginal calibration prior for the root did
not correspond to the specified calibration density, so we reduced the
mean in the calibration prior of the node combining the N. colorata
WGD paralogues with the eudicots and monocots with offset = 124,
mean = 4.4397, s.d. = 0.5 to locate the marginal calibration prior at
220 Ma.

Markov chain Monte Carlo sampling for each orthogroup was run 
for 10 million steps, with sampling every 1,000 steps to produce a 
sample size of 10,000. The resulting trace files were inspected using 
Tracer v.1.5, with a burn-in of 1,000 samples, to check for convergence
and sufficient sampling (minimum effective sample size of 200 for all 
parameters). In total, 263 orthogroups were accepted, and absolute 
age estimates of the node uniting the WGD paralogous pairs based on 
both anchor pairs and peak-based duplicates were grouped into one 
absolute age distribution, for which kernel density estimation and a 
bootstrapping procedure were used to find the peak consensus WGD 
age estimate and its 90% confidence interval boundaries, respectively. 
More detailed methods have been previously described.
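
The final step, finding a consensus peak age by kernel density estimation and attaching a bootstrap interval, can be sketched in plain Python as follows. This is an illustrative approximation only: the Gaussian kernel, the fixed bandwidth, the grid search, the percentile interval and all function names are assumptions, and the number of replicates is kept small for speed (the legend of Extended Data Fig. 2d mentions 2,500 bootstrap replicates).

```python
import math
import random
from typing import List, Tuple


def kde_peak(ages: List[float], bandwidth: float, grid_step: float = 0.5) -> float:
    """Return the grid point at which a Gaussian kernel density estimate is highest."""
    lo, hi = min(ages), max(ages)
    n_steps = int((hi - lo) / grid_step) + 1
    best_x, best_density = lo, -1.0
    for i in range(n_steps):
        x = lo + i * grid_step
        # Unnormalized density: the normalizing constant does not move the peak.
        density = sum(math.exp(-0.5 * ((x - a) / bandwidth) ** 2) for a in ages)
        if density > best_density:
            best_x, best_density = x, density
    return best_x


def bootstrap_peak_interval(ages: List[float], bandwidth: float,
                            n_boot: int = 200, level: float = 0.90,
                            seed: int = 0) -> Tuple[float, float]:
    """Percentile interval of the KDE peak over resampled age estimates."""
    rng = random.Random(seed)
    peaks = sorted(
        kde_peak([rng.choice(ages) for _ in ages], bandwidth)
        for _ in range(n_boot)
    )
    tail = (1.0 - level) / 2.0
    lower = peaks[int(tail * (n_boot - 1))]
    upper = peaks[int((1.0 - tail) * (n_boot - 1))]
    return lower, upper
```

Here `ages` would hold one node-age estimate per accepted orthogroup; the peak from `kde_peak` plays the role of the consensus WGD age and the two values from `bootstrap_peak_interval` the bounds of its 90% interval.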

To identify the duplication events that resulted in the 2,648 anchor
pairs detected in the genome of N. colorata, we performed phylogenomic
analyses to determine the timing of the duplication events
relative to the lineage divergences in Nymphaeales as described previously.
Protein-coding genes from 12 species were used, including
eight species from Nymphaeaceae and one species from Cabombaceae
in Nymphaeales, one species (I. henryi) from Austrobaileyales, plus
Amborella and G. biloba. The phylogeny of the 12 species was obtained
from Fig. 1d, and the branch lengths in Ks units were estimated from 23
LCN genes (selected from the 101 LCN genes used in Fig. 1d, because only
23 are shared across all of the species studied) using PAML under the
free-ratio model. OrthoMCL (v.2.0.9) was used with default parameters
to identify gene families. Then, we removed 907 of the 2,648
anchor pairs with Ks values greater than five. If the remaining anchor
pairs fell into different gene families, thus indicating incorrect assignment
of gene families by OrthoMCL, we merged the corresponding gene
families and finally obtained 53,243 multi-gene gene families. Next,
phylogenetic trees were constructed for a subset of 881 gene families
with no more than 200 genes that had at least one pair of anchors and
one gene from G. biloba. Multiple sequence alignments were produced
by MUSCLE (v3.8.31) and were trimmed by trimAl (v.1.4) to remove
low-quality regions based on a heuristic approach (-automated1).

We then used RAxML (v.8.2.0) with the GTR+G model to estimate a
maximum-likelihood tree, starting with 200 rapid bootstraps followed
by maximum-likelihood optimizations on every fifth bootstrap tree.
Gene trees were rooted based on genes from G. biloba if these formed
a monophyletic group in the tree; otherwise, mid-point rooting was
applied. The timing of the duplication event for each anchor pair relative
to the lineage divergence events was then inferred. In brief, internodes
from a gene tree were first mapped to the species phylogeny
according to the common ancestor of the genes in the gene tree. Each
internode was then classified as a duplication node, a speciation node,
or a node that has no paralogues and is inconsistent with divergence
in the species phylogeny. The parental node(s) of a duplication node
supported by an anchor pair were traced towards the root until reaching
a speciation node in the gene tree. The duplication event that resulted
in the anchor pair was hence circumscribed between the duplication
node as the lower bound and the speciation node as the upper bound
on the species tree. If the two nodes were directly connected by a single
branch on the species tree, the duplication was thus considered to
have occurred on the branch. To reduce biased estimations, we used
the bootstrap value on the branch leading to a duplication node as
support for a duplication event. In total, 497 anchor pairs in 473 gene
families coalesced as duplication events on the species phylogeny, and
duplication events from 254 anchor pairs in 246 gene families (or from
380 anchor pairs in 364 gene families) had bootstrap values greater
than or equal to 80% (or 50%).


Floral scent measurement, gene identification, and functional 
characterization 

We collected floral volatiles of N. colorata using a dynamic headspace
sampling system and analysed them using gas chromatography-mass
spectrometry (GC-MS) as previously described. After 2 h of collection
from the headspace of detached open flowers of N. colorata in a
glass chamber (10 cm diameter, 30 cm height), volatiles were eluted
from the SuperQ volatile collection trap using 100 µl of methylene
chloride containing nonyl acetate as an internal standard. We then
analysed samples using an Agilent Intuvo 9000 GC system coupled
with an Agilent 7000D Triple Quadrupole mass detector. Separation
was performed on an Agilent HP 5 MS capillary column (30 m x 0.25
mm) with helium as carrier gas (flow rate of 1 ml min⁻¹). We applied
splitless injections of 1 µl samples, injection temperature of 250 °C,
an initial oven temperature of 40 °C (3-min hold) and a temperature
gradient of 5 °C per min increase from 40 °C to 250 °C. Products were
identified using the National Institute of Standards and Technology
mass spectral database (https://chemdata.nist.gov).

A full-length cDNA of NC11G0120830 was amplified from the open
flowers of N. colorata using reverse transcription PCR (RT-PCR), and
cloned into pET-32a (MilliporeSigma). After confirmation by sequencing,
NC11G0120830 was expressed in E. coli strain BL21 (DE3) (Stratagene)
and the recombinant protein produced was purified using a
modified nickel-nitrilotriacetic acid agarose (Invitrogen) protocol as
previously reported. For methyltransferase enzyme assays, we used
both radiochemical and non-radiochemical reaction systems. The
radiochemical reaction system (50 µl) was composed of 50 mM Tris-HCl,
pH 7.8, 1 mM substrate, 1 µl ¹⁴C-S-adenosyl-L-methionine, and 1 µl
of purified NC11G0120830. After 30 min of incubation at room temperature,
150 µl of ethyl acetate was added to extract the ¹⁴C-labelled
reaction products. The extracts were counted using a scintillation
counter (Beckman Coulter) to measure the activity of NC11G0120830.
To determine the chemical identity of the reaction product, we performed
non-radiochemical assays in which nonradioactive S-adenosyl-L-methionine
was used as the methyl donor. The reaction product was
collected by headspace solid-phase microextraction and analysed by
GC-MS as previously described.


Reporting summary 
Further information on research design is available in the Nature 
Research Reporting Summary linked to this paper. 


Data availability 


PacBio whole-genome sequencing data, Illumina data and genome 
assembly sequences have been deposited to the NCBI Sequence Read 
Archive (SRA) as Bioproject PRJNA565347, and were also deposited 
in the BIG Data Center (http://bigd.big.ac.cn) under project number 
PRJCA001283. The genome assembly sequences and gene annotations 
have been deposited in the Genome Warehouse in BIG Data Center 
under accession number GWHAAYWO0000000. The genome assembly 
sequences, gene annotations, and the LCN genes used in this study
have also been deposited in the Waterlily Pond (http://waterlily.eplant.org).
All other data are available from the corresponding author upon
reasonable request.

Extended Data Fig. 1 | High-quality genome of N. colorata allows integration
of genetic and expression data. a, The assembled 14 chromosomes. b, Gene
density plotted in a 100-kb sliding window. c, Transposable element (TE)
density plotted in a 100-kb sliding window. d, Gene expression atlas of the
juvenile flower, expression values were transformed with log2(FPKM + 1). e, GC
content plotted in a 100-kb sliding window. f, Intragenomic syntenic regions
denoted by a single line represent a genomic syntenic region covering at least
20 paralogues.

Extended Data Fig. 2 | WGD in Nymphaeales. a, Intergenomic synteny
between N. colorata (14 chromosomes), Amborella (53 longest scaffolds),
and the eudicots N. nucifera (8 longest megascaffolds) and V. vinifera
(19 chromosomes). Five adjacent anchor pairs were plotted as one syntenic
line. Coloured lines represent one example of syntenic genes found in other
species that correspond to one copy in Amborella, two in N. colorata, two in
N. nucifera, and three in V. vinifera. b, Ks distribution for the whole paranome
of N. colorata. The light grey rectangle in the background indicates the Ks
boundaries used to extract duplicate pairs for absolute phylogenomic dating
of the WGD event, and also highlights the range in which WGD peaks can be
identified in other species of Nymphaeaceae (Supplementary Note 5.2).
c, Kernel-density estimates of Ks distributions for one-to-one orthologues
between the outgroup species I. henryi and each of N. lutea and N. advena (red),
N. colorata, N. mexicana and Nymphaea ‘Woods blue goddess’ (blue) and
C. caroliniana (yellow). As each peak represents the same divergence event in
the angiosperm phylogeny, the differences observed among the Ks values of
the peaks indicate substantial substitution rate variation among these
Nymphaealean lineages (see also Fig. 2b). d, Absolute age distribution obtained
from phylogenomic dating of N. colorata WGD paralogues based on
orthogroups with orthologues from Amborella and G. biloba. The solid black
line represents the kernel density estimate of paralogue date estimates, and
the vertical dashed black line represents its peak at 107 Ma. The grey lines
represent density estimates from 2,500 bootstrap replicates and the vertical
black dotted lines represent the corresponding 90% confidence interval for the
WGD age estimate, 117-98 Ma (see Methods). The blue histogram shows the raw
distribution of divergence date estimates for paralogues.

Extended Data Fig. 3 | The phylogenetic tree of MADS-box genes of
N. colorata. a, The MADS-box genes are divided into type I and type II, and the
latter was subdivided into MIKCc and MIKC*. Branches of various species are

Extended Data Fig. 6 | Nymphaealean-specific duplication of the genes that
control the initiation of flowering. a, Phylogenetic tree of the PEBP-domain
containing gene family, including FT, TFL1 and MFT subfamilies across various
water lily species and other representative seed plants. b-d, Phylogenetic tree
of the GI (b), CO (c) and FLC (d) gene family across various water lily species and
other representative seed plants. e, The regulatory pathway for the flowering
time control. The red-labelled gene has two copies in N. colorata and is retained
by nymphaealean-specific WGD.

Extended Data Fig. 7 | Explosive expansion of the TPS-b subfamily and its
implications. a, The phylogenetic classification of TPS-b subfamily into three
groups. b, The group II member NC11G0123420 is the sole gene with high
expression in the petal. c, Whereas most TPS-b members lack the two typical
catalytic motifs, the NC11G0123420 retained both motifs, suggesting its
potential role in producing sesquiterpene in N. colorata.

Extended Data Fig. 8 | The blue anthocyanidin and its potential biosynthesis
pathway in N. colorata. a, The peak of the blue anthocyanidin appears at 3 min
of the high-performance liquid chromatography (HPLC) detection.
b, The three fragments of the blue anthocyanidin and their molecule mass.
c, The molecule of the anthocyanidin was identified as delphinidin
3’-O-(2”-O-galloyl-6”-O-acetyl-β-galactopyranoside), abbreviated as
Dp3’galloyl-acetylGal. d, The postulated pathway for the biosynthesis of
Dp3’galloyl-acetylGal. Gene copy numbers are listed next to the enzymes. 3GGT,
anthocyanidin 3-O-glucoside-2”-O-glucosyltransferase; 3’GT,
3’-O-beta-glucosyltransferase; 5AT, anthocyanin-5-aromatic acyltransferase; ANS,
anthocyanidin synthase; CHI, chalcone isomerase; CHS, chalcone synthase;
DFR, dihydroflavonol-4-reductase; F3H, flavanone-3-hydroxylase; F3’5’H,
flavonoid-3’,5’-hydroxylase. e, Comparative transcriptomic analyses between
the blue- and white-petal cultivars of N. colorata identified two genes, ANS and
UDPGT, that are highly differentially expressed and might be potential
regulators for blue coloration of the petals.

Extended Data Fig. 9 | Expanded stress-related and transcription factor
gene families in the genome of N. colorata. a, Markedly expanded gene
families for stress response and transcriptional regulation. NLR genes contain
NB-ARC domains. Notably, N. colorata encodes the highest proportion of
kinase genes compared with gymnosperms or other land plants. b, NLR genes
expanded in all of its three subfamilies (RNL, TNL and CNL). c, Distribution of
NLR genes across the representative algae and land plants. The background
colours indicate the number variation in each species. d, An example showing
how tandem duplication and WGD contributed to the expansion of R genes in
N. colorata.

Extended Data Table 1 | Statistics of the sequenced and assembled genome of N. colorata

Statistic        Reads*       Contigs      Scaffolds    Chromosomes
Number           5,521,269    1,429        804          14
Longest Length   78 Kb        12.79 Mb     44.61 Mb     44.61 Mb
Total size       49.76 Gb     409.09 Mb    409.15 Mb    378.81 Mb
N50              12.59 Kb     2.14 Mb      25.52 Mb     27.06 Mb

*The reads only include sequencing by PacBio RS II SMRT sequencing technology.
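
For readers unfamiliar with the N50 statistic reported in the table, the following minimal sketch shows how it is conventionally computed from a list of sequence lengths: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total size. The function name n50 and the toy input are illustrative and are not taken from the assembly pipeline.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of sequence lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("n50 requires at least one length")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Stop at the first sequence that brings the running sum to half the total.
        if 2 * running >= total:
            return length
    return ordered[-1]  # not reached for positive lengths; kept for completeness
```

For example, lengths of 2, 2, 2, 3, 3, 4, 8 and 8 sum to 32, and the two longest sequences already cover 16, so the N50 is 8.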

the methods. All software and scripts are available from official websites or GitHub as indicated in the methods.



PacBio whole-genome sequencing data and Illumina data were deposited to the SRA at the NCBI under the BioProject ID PRJNA565347.
PacBio whole-genome sequencing data and Illumina data were also deposited in the BIG Data Center (http://bigd.big.ac.cn) under project number PRJCA001283.
The genome assembly sequences and gene annotations have been deposited in the Genome Warehouse in BIG Data Center under accession number
GWHAAYWO00000000 and in ENA BioProject (PRJEB34452). The genome assembly sequences and gene annotations have also been deposited in the Waterlily Pond
(http://waterlily.eplant.org). All these data are freely available to the public.

Methodology

Sample preparation: Nuclei were isolated from young leaves in spring, using PI staining for 15 minutes.
Instrument: Beckman Coulter COULTER EPICS XL™
Software: FACS data analyses were performed using CXP v2.2 Software.
Cell population abundance: >8000 cells were collected for each sample. Total nuclei populations were gated using relative fluorescence intensity;
the proportions of nuclei with different ploidy levels were determined based on their relative fluorescence intensity, with pear, a diploid
(2N), as a reference, according to the peak position (Supplementary Figure 5).
Gating strategy: Total nuclei populations were gated using PI intensity. In PI+ single cells, the proportions of nuclei with different ploidy levels
were determined based on their PI intensity (Supplementary Figure 5).



Article 


RGF1 controls root meristem size through ROS signalling

https://doi.org/10.1038/s41586-019-1819-6

Masashi Yamada, Xinwei Han & Philip N. Benfey

Received: 30 November 2017
Accepted: 22 October 2019
Published online: 4 December 2019

The stem cell niche and the size of the root meristem in plants are maintained by
intercellular interactions and signalling networks involving a peptide hormone, root
meristem growth factor 1 (RGF1). Understanding how RGF1 regulates the
development of the root meristem is essential for understanding stem cell function.
Although five receptors for RGF1 have been identified, the downstream signalling
mechanism remains unknown. Here we report a series of signalling events that follow
RGF activity. We find that the RGF1-receptor pathway controls the distribution of
reactive oxygen species (ROS) along the developmental zones of the Arabidopsis root.
We identify a previously uncharacterized transcription factor, RGF1-INDUCIBLE
TRANSCRIPTION FACTOR 1 (RITF1), that has a central role in mediating RGF1 signalling.
Manipulating RITF1 expression leads to the redistribution of ROS along the root
developmental zones. Changes in ROS distribution in turn enhance the stability of the
PLETHORA2 protein, a master regulator of root stem cells. Our results thus clearly
depict a signalling cascade that is initiated by RGF1, linking this peptide to
mechanisms that regulate ROS.


Plant roots encounter varying environmental conditions and respond
by altering their growth. Root growth arises through controlled cell
division in the root’s meristematic zone (equivalent to the transit
amplifying zone in animals). After division, most cells increase their
size in the elongation zone, and mature in the differentiation zone.
The sizes of these developmental zones are determined by intrinsic
and extrinsic signals. ROS are an intrinsic signal for establishing the
size of the meristematic zone: superoxide (O₂⁻) accumulates primarily
in the meristematic zone, hydrogen peroxide (H₂O₂) accumulates
mainly in the differentiation zone, and the balance between O₂⁻ and
H₂O₂ modulates the transition from proliferation to differentiation.
The RGF1 peptide is essential in controlling the size of the meristematic
zone, acting as both an intrinsic and an extrinsic signal.
Treating roots with RGF1 increases the size of the meristematic zone,
and the Arabidopsis rgf1/2/3 triple mutant has a smaller meristematic
zone. Quintuple mutants of the rgf1 receptor (rgfr) lack most cells in
the root meristem and are insensitive to RGF1. RGF1 signalling
controls the stability of the PLETHORA (PLT) 1/2 proteins, which are
required for stem cell maintenance. However, it is not known how RGF1
modulates the size of the meristematic zone and the stability of PLT1/2.
We began by treating Arabidopsis roots with RGF1, and detected green
fluorescent protein (GFP)-labelled HIGH PLOIDY2 (HPY2) (a marker
protein specific to the meristematic zone) in an enlarged area that
correlates with a larger meristematic zone (Extended Data Fig. 1a-c),
suggesting that RGF1 controls gene expression primarily in this zone.
Therefore, to identify target genes that are downstream of RGF1, we
isolated the meristematic zone 1 h after RGF1 treatment (Extended
Data Fig. 1d). Given that HPY2-GFP expression and the size of the
meristematic zone were unchanged in this time period, we can exclude the
possibility that an enlarged meristem is the reason for any changes in
RNA levels. RNA-sequencing (RNA-seq) profiling found 583 genes that
were differentially expressed between the RGF1-treatment and
mock-treatment scenarios (Supplementary Table 1). Highly enriched Gene
Ontology categories included ‘glutathione transferase activity’ and
‘oxidoreductase activity’ (Extended Data Fig. 2 and Supplementary Table 2),
suggesting that RGF1 might signal through an ROS intermediate.

To examine the relationship between RGF1 and ROS signalling,
we analysed the distribution of O₂⁻ and H₂O₂ after RGF1
treatment. A specific indicator for H₂O₂, namely
H₂O₂-3’-O-acetyl-6’-O-pentafluorobenzenesulfonyl-2’-7’-difluorofluorescein-Ac
(H₂O₂-BES-Ac), exhibited lower fluorescence in the meristematic and elongation
zones 24 h after RGF1 treatment (Fig. 1a, c). We detected O₂⁻ signals by
nitro blue tetrazolium (NBT) staining and observed these signals more
broadly in the meristematic zone 24 h after RGF1 treatment (Fig. 1b, d).
In the RGF1-receptor mutant rgfr1/2/3, the meristematic zone of which is
unchanged after RGF1 treatment (Fig. 1e), levels of H₂O₂ and O₂⁻ were
comparable between mock and RGF1 treatments (Fig. 1e-h).

To identify downstream factors in the RGF1 and ROS signalling
pathway, we combined our RGF1 transcriptome data with
developmental-zone-specific transcriptome data. Among genes that are
both meristematic-zone-specific and induced by RGF1, we identified
PLANT AT-RICH SEQUENCE AND ZINC-BINDING TRANSCRIPTION FACTOR
(PLATZ) FAMILY PROTEIN (AT2G12646), the expression of which
increased approximately twofold after 1 h of RGF1 treatment (Fig. 2a).
We named this gene RGF1-INDUCIBLE TRANSCRIPTION FACTOR 1 (RITF1),
and found that its expression occurs predominantly in the meristematic
zone (Fig. 2b). Quantitative reverse transcription with polymerase
chain reaction (RT-PCR) showed that the abundance of the RITF1
transcript increased approximately twofold in wild-type roots 1 h after RGF1
treatment, and was maintained at 6 h and 24 h (Fig. 2c). By contrast,
Fig. 1 | Distribution of ROS levels upon RGF1 treatment. a, Confocal images of
roots 24 h after mock treatment or treatment with 20 nM RGF1. Propidium
iodide (PI) staining is in red; H₂O₂-BES-Ac fluorescence is in green. b, Roots
stained with NBT 24 h after mock treatment or treatment with 20 nM RGF1.
c, Quantification of H₂O₂-BES-Ac intensity in the meristematic zone (n = 6
independent roots; P < 0.003). d, Quantification of NBT staining intensity (in
arbitrary units, AU) in the meristematic zone (n = 7 independent roots;
P = 3.16 × 10^…). e, Confocal images of roots 24 h after mock treatment or
treatment with 20 nM RGF1 in wild-type roots (Columbia-0 (Col-0)
background) or rgfr1/2/3 mutants. Staining as in a. f, Quantification of
H₂O₂-BES-Ac staining intensity in the meristematic zone in wild-type and rgfr1/2/3
roots (n = 5 independent roots; *P < 0.025). g, Wild-type or rgfr1/2/3 roots
stained with NBT 24 h after mock treatment or treatment with 20 nM RGF1.
h, Quantification of NBT staining intensity in the meristematic zone of
wild-type or rgfr1/2/3 roots (n = 5 independent roots; *P = 1.65 × 10^…). White and blue
arrowheads indicate the junction between the meristematic and elongation
zones. Scale bar, 50 μm. Bar graphs show means. Error bars show ± s.d. Dots
indicate each data point. P values are calculated by two-sided Student’s t-test.
Fig. 2 | Expression of RITF1 and phenotype of RITF1 overexpression line.
a, Expression of RITF1 in the meristematic zone 1 h after treatment with 100 nM
RGF1, measured by RNA-seq (CPM, counts per million mapped reads; n = 3
independent experiments; P < 0.01). b, Expression of RITF1 in developmental
zones as measured by RNA-seq (FPKM, fragments per kilobase of transcript per
million mapped reads). c, Expression of RITF1 in the meristematic zone of
wild-type and rgfr1/2/3 roots upon treatment with RGF1, measured by quantitative
RT-PCR (n = 3 independent experiments; *P < 0.001, **P < 0.002, ***P < 0.02).
d, Confocal images of pRITF1-GFP expression and PI staining in wild-type and
rgfr1/2/3 roots after RGF1 treatment. e, Total intensity of pRITF1-GFP
expression in wild-type and rgfr1/2/3 roots 24 h after RGF1 treatment (n = 5
independent roots; *P < 0.001). f, g, Confocal images of roots stained with PI (f)
and H₂O₂-BES-Ac (g) in Col-0 and XVE-RITF1 roots after mock or oestradiol
treatment. h, Light microscope images of NBT-stained roots after mock or
oestradiol treatment. i, Number of cells in the meristematic zone in Col-0 and
XVE-RITF1 roots after mock or oestradiol treatment (n = 6 independent roots;
*P < 0.001). j, Average intensity of BES-H₂O₂-Ac in the differentiation zone after
mock or oestradiol treatment (n = 6 independent roots; *P < 0.001). k, Average
intensity of NBT staining in the differentiation zone after mock or oestradiol
treatment (n = 7 independent roots; *P < 0.001). Scale bar, 50 μm. White and
blue arrowheads throughout indicate the junctions between the meristematic
and elongation zones and between the elongation and differentiation zones.
Bar graphs show means. Error bars are ± s.d. Dots indicate each data point.
P values are calculated by two-sided Student’s t-test.
Fig. 3 | ROS signals and meristem size in RITF1 overexpression lines in
rgfr1/2/3 roots. a, Light microscope images of NBT-stained roots with or
without XVE-RITF1 expression in Col-0 and rgfr1/2/3 roots. b, Total intensity of
NBT staining in the differentiation zone with or without XVE-RITF1 expression,
in Col-0 and rgfr1/2/3 roots, 24 h after mock or oestradiol treatment (n = 8
independent roots; *P < 2.0 × 10^…). c, Confocal images of PI-stained roots with
or without XVE-RITF1 expression in Col-0 and rgfr1/2/3 roots. d, Percentage
increase in the number of cells in the meristematic zone (in which 100% is the
number of cells in the mock treatment scenario) 24 h after oestradiol treatment
compared with mock treatment in Col-0 roots, rgfr1/2/3 roots, and XVE-RITF1-


RITF1 expression in rgfr1/2/3 roots was unchanged upon RGF1 treatment
(Fig. 2c). Expression of a construct with the RITF1 promoter driving the
GFP-coding sequence (pRITF1-GFP) mirrored our transcriptome analysis
and increased in the wild type following RGF1 treatment (Fig. 2b, d, e).
By contrast, pRITF1-GFP expression was very low and exhibited no
change following RGF1 treatment in rgfr1/2/3 mutants (Fig. 2d, e). These
data indicate that RITF1 expression is regulated by the RGF1 pathway.

To understand its function, we inducibly overexpressed RITF1 using
the oestradiol-inducible promoter system. After 24 h of β-oestradiol
treatment, the meristematic zone became enlarged and the number of
cells increased (Fig. 2f, i), similarly to RGF1-treated roots (Fig. 1a). We
also found that H₂O₂ levels declined in all three developmental zones
upon oestradiol treatment (Fig. 2g, j), and that enhanced O₂⁻ signals were
observed in a broader area of the meristematic zone (Fig. 2h, k), with
ectopic O₂⁻ signals in the elongation and differentiation zones (Fig. 2h).
Altered ROS signals and an enlarged meristem suggest that RITF1 can
modulate ROS signalling and root meristem size downstream of the
RGF1 pathway. We also observed an earlier response to the induction
of RITF1 than to RGF1 treatment. A decrease in the H₂O₂-BES-Ac signal
was detected 4 h after oestradiol treatment (Extended Data Fig. 3a, b), in
contrast to the lack of detectable change seen 4 h after RGF1 treatment
in either the uninduced line or in the wild type (Extended Data Fig. 3a, b).
Changes in ROS signals were first observed at approximately 6 h after
RGF1 treatment in those lines (Extended Data Figs. 4i, j, o, p and 5b, c).

If RITF1 functions downstream of the RGF1-receptor pathway,
then overexpression of RITF1 in rgfr1/2/3 mutants should rescue root
expressing Col-0 and rgfr1/2/3 roots (n = 6 independent roots; *P < 0.0002,
**P < 0.0007). e, Light microscope images of Col-0, ritf1-1 and ritf1-2
roots stained with NBT 24 h after 5 nM RGF1 treatment. Scale bar, 50 μm. Blue
arrowheads show the junction between the meristematic and elongation
zones. f, Quantification of NBT staining intensity in the meristematic zone in
Col-0, ritf1-1 and ritf1-2 roots after 5 nM RGF1 treatment (n = 7 independent
roots; *P < 2.4 × 10^…, **P < 0.021). Bar graphs show means. Error bars represent
± s.d. Dots indicate each data point. P values are calculated by two-sided
Student’s t-test.


meristem defects and increase root meristem size. To test this hypothesis,
we inducibly overexpressed RITF1 in rgfr1/2/3 mutants and in the wild
type, and observed an enhanced O₂⁻ signal and increased root meristem
size in both (Fig. 3a-d). Finally, we examined two ritf1 mutant alleles. We
generated the ritf1-1 allele using CRISPR-Cas9; it contains a frameshift
mutation early in the coding sequence, rendering it unlikely to produce a
functional RITF1 protein. The ritf1-2 allele has a transfer-DNA insertion in
the intron, but still shows low expression of full-length RITF1 and is likely
to produce low levels of a functional protein. The ritf1-1 mutant had a
smaller meristem and lower root growth rate (Extended Data Fig. 6a, b)
and was more resistant to RGF1 treatment than were wild-type plants
or those with the weak allele, ritf1-2 (Extended Data Fig. 6b, c). Further,
there was lower induction of the O₂⁻ signal in ritf1-1 mutants after RGF1
treatment than in the wild-type or ritf1-2 background (Fig. 3e, f). Taken
together, these results strongly suggest that RITF1 is a primary regulator
of ROS signalling and root meristem size in the RGF1 signalling pathway.
To confirm post-translational regulation of PLT2, we compared
transcriptional (pPLT2-CFP) and translational (gPLT2-YFP) fusion lines (in
which CFP and YFP are cyan and yellow fluorescent protein, respectively).
At 24 h after RGF1 treatment, we observed broader localization of gPLT2-YFP
(Extended Data Fig. 7b), and the localization and expression of pPLT2-CFP
were comparable between mock and RGF1 treatments, even though
RGF1-treated roots had a larger meristematic zone (Extended Data Fig. 7a). The
gPLT2-YFP signal decreased more gradually and was broadly localized in the
larger meristematic zone after RGF1 treatment (Extended Data Fig. 7a-c).
These results confirm that RGF1 regulates PLT2 post-translationally.
Fig. 4 | Stability of the PLT2 protein upon changes in oxidation conditions.
a, d, Confocal images of gPLT2-YFP 24 h after treatment with RGF1, KI (a H₂O₂
scavenger) and DPI (an inhibitor of NADPH oxidase). b, e, Localization of
gPLT2-YFP upon treatment with RGF1 and KI (n = 7 independent roots; *P < 0.015) and
RGF1 and DPI (n = 7 independent roots; *P < 1.5 × 10^…). c, f, Meristem size upon
treatment with RGF1 and KI (n = 7 independent roots, *P < 0.0017) and RGF1 and
DPI (n = 7 independent roots, *P < 2.6 × 10^…). Bar graphs show means. Error bars
show ± s.d. Dots indicate each data point. P values are calculated by two-sided
Student’s t-test. QC, quiescent centre.


PLT2 is a member of the APETALA2/ETHYLENE-RESPONSE FACTOR
family of transcription factors, which has previously been reported
to be regulated by oxidative post-translational modification. To
determine whether modifying the oxidative conditions can increase
the stability of the PLT2 protein, we treated the gPLT2-YFP line with
RGF1 and potassium iodide (KI), an H₂O₂ scavenger. We found that
gPLT2-YFP was localized more broadly and that meristem size was larger
than in roots treated only with RGF1 (Fig. 4a-c). By contrast, increased
H₂O₂ levels inhibited the broad localization of gPLT2-YFP and reduced
the increase in meristem size upon addition of RGF1 (Extended Data
Fig. 8a-e). To decrease O₂⁻ levels, we used a low concentration (500 nM)
of diphenyleneiodonium (DPI), an NADPH oxidase inhibitor (Fig. 4d-f),
resulting in a slight inhibition of PLT2 stability and slight decrease in
meristem size (Fig. 4d-f) with little effect on root meristem development.
However, co-treatment using RGF1 and DPI markedly reduced
PLT2 stability and meristem size as compared with RGF1 treatment
alone (Fig. 4d-f). Finally, we measured gPLT2-YFP, O₂⁻ and H₂O₂ levels
in a time course (4-10 h) after RGF1 treatment. Broader localization
of gPLT2-YFP and increased superoxide levels along with lower H₂O₂
signals at the distal end of the meristematic zone appeared 6 h after
treatment (Extended Data Figs. 4a-d, i, j, o, p and 5a-c). At 8 h and
10 h after treatment, expanded gPLT2-YFP expression and O₂⁻ signals
correlated with declining H₂O₂ signals (Extended Data Figs. 4e-h, k-n,
q-t and 5a-c). Taken together, these results indicate that ROS regulates
PLT2 protein stability by modulating O₂⁻ and H₂O₂ levels.

To further test the hypothesis that the stability of the PLT2 protein
is enhanced by ROS signalling produced by RITF1, we overexpressed
RITF1 in the plt2 mutant. This produced an increase in the O₂⁻ signal
(Extended Data Fig. 9a, b) but was unable to induce an increase in root
meristem size (Extended Data Fig. 9c, d). Furthermore, we detected
only a subtle change in root meristem size in plt2 mutants as compared
with wild-type roots upon RGF1 treatment (Extended Data Fig. 10a, b).
However, we did observe an elevated O₂⁻ signal (Extended Data Fig. 10c,
d). These results strongly suggest that ROS signals modulated by RITF1
enhance PLT2 stability. In summary, we have identified a new transcription
factor, RITF1, which is induced by RGF1 in the meristematic zone.
This factor controls ROS levels, which in turn regulate PLT2 stability and
meristem size. Overall, our data demonstrate a key role for the peptide
hormone RGF1 in regulating root growth via modulation of ROS levels,
which control the transition from proliferation to differentiation.
Methods


Plant materials and growth conditions 

All Arabidopsis mutants and marker lines used here are in the
Columbia-0 (Col-0) background. The transfer (T)-DNA plt2 insertion line
(SALK_130119.20.25) was obtained from the Arabidopsis Biological
Resource Center at Ohio State University. The T-DNA insertion was
found to be 166 base pairs upstream of the transcription start site in the
plt2 mutant. Seeds were surface-sterilized using 50% (v/v) bleach and
0.1% Tween 20 (Sigma) for 15 min and then rinsed five times with sterile
water. All seeds were plated on standard MS medium (1× Murashige
and Skoog salt mixture (Caisson Laboratories), 0.5 g l⁻¹ MES, 1% sucrose
and 1% agar (Difco), adjusted to pH 5.7 with KOH). All plated seeds
were stratified at 4 °C for 2 days before germination. Seedlings were
grown on vertically positioned square plates in a Percival incubator
with 16 h of daily illumination at 22 °C.


The ritf1 mutants 
The ritf1-1 mutant was generated using the egg-cell-specific controlled
CRISPR-Cas9 system.

sgRNA sequences are as follows: RITF1 sgRNA1, GGGATGTCCATACCATGAGA CGG;
RITF1 sgRNA2, CCGTCTACCACAGTTGATCG AGG;
RITF1 sgRNA3, GGCGAACTTGAAGGAGTCTA TGG; and RITF1 sgRNA4,
GACTTTCAGTTGAGTCCTCA TGG.
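
As a quick sanity check on the sequences listed above, the following minimal Python sketch tests whether each entry reads as a 20-nt protospacer followed by an NGG motif. The interpretation of the trailing three nucleotides as a protospacer-adjacent motif is an assumption made for illustration (the text does not state it), and the function names are hypothetical.

```python
import re

# Sequences as listed in the text; the space separates the last three
# nucleotides, which are ASSUMED here to be an NGG protospacer-adjacent motif.
SGRNAS = {
    "RITF1 sgRNA1": "GGGATGTCCATACCATGAGA CGG",
    "RITF1 sgRNA2": "CCGTCTACCACAGTTGATCG AGG",
    "RITF1 sgRNA3": "GGCGAACTTGAAGGAGTCTA TGG",
    "RITF1 sgRNA4": "GACTTTCAGTTGAGTCCTCA TGG",
}


def split_target(entry: str):
    """Return (protospacer, trailing 3 nt) for an entry, ignoring spaces."""
    seq = entry.replace(" ", "").upper()
    return seq[:-3], seq[-3:]


def looks_like_ngg_target(entry: str) -> bool:
    """True if the entry is 20 nt of A/C/G/T followed by N-G-G."""
    protospacer, pam = split_target(entry)
    return (
        len(protospacer) == 20
        and re.fullmatch(r"[ACGT]+", protospacer) is not None
        and re.fullmatch(r"[ACGT]GG", pam) is not None
    )


if __name__ == "__main__":
    for name, entry in SGRNAS.items():
        protospacer, pam = split_target(entry)
        print(name, protospacer, pam, looks_like_ngg_target(entry))
```

A check of this kind only validates the format of the listed sequences; it says nothing about on-target efficiency or off-target sites, which the authors assessed by sequencing.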

The CRISPR construct was transformed into the Col-0 background
using the Agrobacterium-mediated floral dip method. The mutant was
identified by direct sequencing of PCR products of the targets in the
offspring in T1, T2 and T3 generations. The loss-of-function ritf1-1 allele
contains an insertion of a cytosine 74 bp after the transcription start site
in the RITF1 gene (771 bp). The additional insertion of a cytosine results
in a frameshift and creates many premature stop codons after the insertion.
To exclude issues related to off-target mutations, we confirmed
the sequences of three potential off-target genes (At5g25170, At1g70110
and At3g20640) that include similar sequences of the target sites by
direct sequencing of PCR products in the offspring in the T1, T2 and
T3 generations. We did not find any mutations in these genes. Further,
we identified another independent CRISPR allele (ritf1-3). This allele
contains an insertion of an adenine 75 bp after the transcription start
site in the RITF1 gene. The additional insertion of an adenine results in a
frameshift and creates many premature stop codons after the insertion.
Similar to ritf1-1 mutants, ritf1-3 seedlings exhibited strong resistance
to the RGF1 peptide and did not increase their O₂⁻ levels by comparison
with wild-type seedlings or with the weak allele (ritf1-2) (Extended Data
Fig. 6d, e). These results exclude the possibility that off-target
mutations cause the RGF1-resistant phenotype.

In the ritf1-2 allele (SALK_081503C), we identified the T-DNA insertion
787 bp downstream of the transcription start site (in the middle
of the second intron) of RITF1. Even though the insertion disrupted an
intron, a full-length transcript was weakly detected from this allele.


Detecting gPLT2-YFP and ROS signals 

We grew wild-type and rgfr1/2/3 mutant plants for seven days on MS
agar plates, then transferred them to MS agar plates containing either
water (mock treatment) or 20 nM synthetic sulfated RGF1 peptide
(Invitrogen). After treatment with RGF1, seedlings were stained for
2 min in a solution of 200 μM NBT in 20 mM phosphate buffer (pH 6.1)
in the dark and rinsed twice with distilled water. To detect hydrogen
peroxide with BES-H₂O₂-Ac, we incubated seedlings in 50 μM
BES-H₂O₂-Ac (WAKO) for 30 min in the dark, then mounted them in 10 mg ml⁻¹ PI
in water. Roots were observed using a ×20 objective with a Zeiss LSM
880 laser scanning confocal microscope. Excitation and detection
windows were set as follows: BES-H₂O₂-Ac, excitation at 488 nm and
detection at 500-550 nm; PI staining, excitation at 561 nm and detection
at 570-650 nm. Confocal images were processed, stitched and analysed
using the Fiji package of ImageJ. Maximum projection images were
generated from about 30 z-section images of BES-H₂O₂-Ac staining.
The average intensity of BES-H₂O₂-Ac in the meristematic zone was
measured in five or six roots with three biological replicates. Images
for NBT staining were obtained using a ×10 objective with a Leica DM
5000-B light microscope. The total intensities of NBT staining in the
meristematic zone were measured in ten roots with three biological
replicates using the Fiji software package.

For experiments with a shorter time course, we grew gPLT2-YFP
seedlings on MS agar plates for seven days, then transferred them to MS
agar plates containing either water (mock) or 100 nM RGF1 peptide.
At 4 h, 6 h, 8 h and 10 h after mock or RGF1 treatment, images were
taken with a confocal or light microscope after PI, NBT and
BES-H₂O₂-Ac staining, as above.


Total RNA extraction and library preparation 

The HPY2-GFP line was grown on MS plates for seven days. HPY2-GFP
seedlings were then transferred into liquid MS medium and treated
with water (mock) or 100 nM RGF1 peptide in 6-well plates for 1 h. After
mock or RGF1 treatment, the seedlings were taken out of liquid MS
medium and transferred onto a 2% agarose plate. Using an ophthalmic
scalpel (Feather), the meristematic zone of the seedlings was precisely
dissected on the basis of HPY2-GFP fluorescence as detected under a
dissecting microscope (Axio Zoom, Zeiss). Using the RNeasy Micro Kit
(Qiagen), we extracted total RNA from 20 root sections treated with
water (mock) or 100 nM RGF1. For each treatment, three replicates
of the RNA extractions were performed. All total RNA samples were
treated with DNase I during RNA extraction. RNA quality was examined
using a 2100 Bioanalyzer (Agilent). The RNA integrity number was more
than 9.0 in all samples. The concentration of total RNA was measured
by a Qubit (Invitrogen) instrument. For each replicate, we generated
complementary DNA (cDNA) from 50 ng total RNA using the Ovation
RNA-seq System V2 (NUGEN). We fragmented 3 μg of the cDNA using
the Covaris S-Series System. We used 400 ng of the fragmented cDNA
with an average size of 400 bp for library preparation with the Ovation
Ultralow System V2 (NUGEN). Illumina sequencing was performed at
the Duke Genome Sequencing Shared Resource. The libraries for three
biological replicates of mock- and RGF1-treated meristematic zones
were sequenced on an Illumina HiSeq 2000 (100-base paired-end reads).


Differential expression analysis after RGF1 treatment 

Illumina sequencing reads were mapped to the TAIR10 Arabidopsis
genome using TopHat v2.1.1. The parameters used for mapping were:
‘-N 5 --read-gap-length 5 --read-edit-dist 5 --b2-sensitive -r 100
--mate-std-dev 150 -p 5 -i 5 -I 15000 --min-segment-intron 5
--max-segment-intron 15000 --library-type fr-unstranded’. To select properly mapped
reads with unique mapping positions, we kept for further analysis only
those alignments with a flag of 83, 99, 147 or 163 and a mapping quality
score of 50. Mapping positions of these reads were compared with the
Araport11 genome annotation (https://www.araport.org/downloads/
Araport11_Release_201606/annotation) using HTSeq-count (v0.6.1) with
parameters ‘--stranded=no --mode=intersection-nonempty’, which
generated a read count per gene. The raw read counts of microRNAs, long
non-coding RNAs and protein-coding genes were then used as input
into DESeq2 (v1.14.1) for differential gene expression analysis. Genes
with a false discovery rate (FDR)-adjusted P value less than or equal
to 0.1 were regarded as differentially expressed between the RGF1 and
mock treatment scenarios. The enriched Gene Ontology (GO) groups
among differentially expressed genes were identified using agriGO. The
GO annotation downloaded from http://geneontology.org was used as
input for agriGO. Enriched GO groups required an FDR-adjusted P value
of 0.01 or less and a minimum mapping entry of 10.
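
The read-selection step described above (keeping alignments with a flag of 83, 99, 147 or 163 and a mapping quality of 50) can be sketched in a few lines of Python. This is an illustrative reconstruction operating on plain SAM text, not the authors' script; the function names and the example records are hypothetical.

```python
# Illustrative sketch: keep only properly paired, uniquely mapped alignments.
PROPER_PAIR_FLAGS = {83, 99, 147, 163}  # flag values named in the text
UNIQUE_MAPQ = 50                        # mapping-quality score named in the text


def keep_alignment(sam_line: str) -> bool:
    """Return True if a SAM record passes the flag and MAPQ filter."""
    if sam_line.startswith("@"):        # header lines are not alignments
        return False
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])               # SAM column 2: bitwise FLAG
    mapq = int(fields[4])               # SAM column 5: MAPQ
    return flag in PROPER_PAIR_FLAGS and mapq == UNIQUE_MAPQ


def filter_sam(lines):
    """Return the alignment records that pass the filter, in input order."""
    return [line for line in lines if keep_alignment(line)]


if __name__ == "__main__":
    # Hypothetical records, for illustration only.
    records = [
        "@HD\tVN:1.0\tSO:coordinate",
        "read1\t99\tChr1\t100\t50\t100M\t=\t300\t300\t*\t*",
        "read1\t147\tChr1\t300\t50\t100M\t=\t100\t-300\t*\t*",
        "read2\t99\tChr1\t500\t3\t100M\t=\t700\t300\t*\t*",   # low MAPQ
        "read3\t73\tChr2\t900\t50\t100M\t*\t0\t0\t*\t*",      # mate unmapped
    ]
    for rec in filter_sam(records):
        print(rec.split("\t")[0], rec.split("\t")[1])
```

The four flag values correspond to the two orientations of a properly paired first and second mate, so the filter discards unpaired, discordant and multiply mapped reads in one pass.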


qRT-PCR analysis of RITF1 expression upon RGF1 treatment
To perform qRT-PCR, we dissected about 20 meristematic zones of
wild-type and rgfr1/2/3 mutant roots at 1 h, 6 h and 24 h after RGF1 treatment
as described above. We generated cDNA from 10 1g of total RNA using
SuperScript IV Reverse Transcriptase (Invitrogen). Three biological
replicates and technical replicates were used for each experiment.
Standard curves were run for the primer pairs of: RITF1,
5’-CAAGCCATGCCACACTCTAA-3’ and 5’-TTATCCGAGGAAGCTGAGGA-3’; and (as
reference) PROTEIN PHOSPHATASE 2A SUBUNIT A3 (PP2AA3, AT1G13320),
5’-GGCCAAAATGATGCAATCTC-3’ and 5’-TGCGAAATACCGAACATCAA-3’.
Expression of RITF1 was assayed by qRT-PCR on a LightCycler
480 (Roche) with SYBR-based detection, normalized to PP2AA3, and
analysed by the efficiency-corrected quantification model.
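
For readers unfamiliar with efficiency-corrected quantification, the sketch below shows one common form of the calculation, in which amplification efficiencies taken from standard-curve slopes replace the fixed value of 2 used in the simpler ΔΔCt method. The exact formula used by the authors is not given in the text, so this form, the function names and the example numbers are assumptions for illustration.

```python
def efficiency_from_slope(slope: float) -> float:
    """Amplification efficiency E from a standard-curve slope.

    The slope is that of Ct plotted against log10(template amount);
    E = 10 ** (-1 / slope), so a slope of about -3.32 gives E of about 2.
    """
    return 10.0 ** (-1.0 / slope)


def relative_expression(
    e_target: float,
    ct_target_control: float,
    ct_target_treated: float,
    e_ref: float,
    ct_ref_control: float,
    ct_ref_treated: float,
) -> float:
    """Efficiency-corrected expression ratio (treated relative to control).

    ratio = E_target ** dCt_target / E_ref ** dCt_ref,
    where each dCt is Ct(control) - Ct(treated).
    """
    target_term = e_target ** (ct_target_control - ct_target_treated)
    ref_term = e_ref ** (ct_ref_control - ct_ref_treated)
    return target_term / ref_term


if __name__ == "__main__":
    # Hypothetical Ct values: the target crosses threshold one cycle earlier
    # after treatment while the reference gene is unchanged.
    ratio = relative_expression(
        e_target=2.0, ct_target_control=25.0, ct_target_treated=24.0,
        e_ref=2.0, ct_ref_control=20.0, ct_ref_treated=20.0,
    )
    print(ratio)
```

With perfect efficiencies (E = 2 for both primer pairs) the expression reduces to the familiar 2^ΔΔCt ratio; here RITF1 would play the role of the target and PP2AA3 that of the reference.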


Plasmid constructs 

To produce the overexpression line and the transcriptional reporter line
of RITF1, we amplified the coding sequence (771 bp) or the promoter
sequence (2,121 bp) of the RITF1 gene (AT2G12646) using the Phusion
High-Fidelity DNA polymerase (New England Biolabs) from a wild-type
cDNA library and genomic DNA, respectively, then subcloned into the
pENTR/D/TOPO vector (Invitrogen). We used the following primers to
amplify the coding sequence: 5’-CACCATGGGAATTCAGAAACCGG-3’
and 5’-TTAACAGAGAGGAGATCGTTG-3’; and for the promoter,
5’-CACCGCATCATTTTATTATAACCCGA-3’ and
5’-GAGGACTCAACTGAAAGTCA-3’. We confirmed the sequences of the coding sequence and
the promoter in the pENTR/D/TOPO vector using Sanger sequencing.
The clones were recombined into the pMDC7 and pMDC204 vectors
using LR Clonase II (Invitrogen) in order to fuse the oestradiol-inducible
promoter (XVE) with the coding region of RITF1, and the RITF1 promoter
with GFP carrying a carboxy-terminal HDEL retention sequence.


Meristem size and ROS detection after RITF1 overexpression 

We transformed the XVE-RITF1 construct into the wild-type (Col-0)
background. To measure meristem size and detect ROS signals, we
grew two independent XVE-RITF1 and wild-type lines on MS medium for
seven days, then transferred them to MS medium containing
dimethylsulfoxide (DMSO, mock) or 10 μM β-oestradiol (Sigma). After 24 h with
mock or oestradiol treatment, we measured meristem size and detected
ROS signals in the wild-type and XVE-RITF1 lines, as above.


Expression of pRITF1-GFP in roots

We introduced the pRITF1-GFP construct into wild-type (Col-0) and
rgfr1/2/3 plants. We grew two independent T3 lines of each background
for seven days in MS medium and treated them with either water (mock)
or 20 nM RGF1 peptide. As described above, 24 h after treatment, GFP
signals were detected using a confocal laser scanning microscope.


Note 

UPB1 is not required for the RGF1-receptor pathway. It has previously
been reported that UPBEAT1 (UPB1) reduces H₂O₂ levels and
controls meristem size by downregulating peroxidase genes in the
elongation zone. However, our present transcriptome analysis did
not find substantial changes in UPB1 expression upon RGF1 treatment
(Supplementary Tables 1, 3). We did find elevated expression
of five peroxidase genes (Supplementary Table 1), but these are not
targets of UPB1, suggesting that RGF1 regulates meristem size
independently of UPB1. To determine whether the peroxidase genes
upregulated by RGF1 play a part in controlling meristem size in the
RGF1-signalling pathway, we overexpressed two of them (At5g39580
and At4g08780). In neither case did we observe a larger meristematic
zone (data not shown).


Statistics and reproducibility 

Experiments were independently repeated three times with similar 
results. No power analysis was done to estimate sample size. The experi- 
ments were not randomized and investigators were not blinded to 
allocation during experiments and outcome assessment. 


Reporting summary 
Further information on research design is available in the Nature 
Research Reporting Summary linked to this paper. 


Data availability 


All RNA-seq data from this study have been deposited in the National
Center for Biotechnology Information (NCBI) Gene Expression Omnibus
(GEO), with the accession number GSE108730. Source data for
all graphs have been provided. A previous version of this work was
deposited in the preprint depository server bioRxiv at
https://doi.org/10.1101/244947. Source Data for Figs. 1-4 and Extended Data
Figs. 1, 3, 5-10 are provided with the paper. All other data are available
from the corresponding author upon reasonable request.
Extended Data Fig. 1 | Expression of meristematic-zone marker and
transcriptome analysis upon RGF1 treatment. a, b, Confocal images of
HPY2-GFP roots 24 h after treatment with water (mock; a) or 20 nM RGF1 (b).
Seedlings were grown on MS plates for seven days before treatment. Left,
PI-stained roots; right, GFP signals. White and blue arrowheads indicate the
junction between the meristematic and elongation zones. Scale bar, 50 μm.
c, Area of HPY2-GFP expression (in μm²; n = 8 independent roots; P < 2.1 × 10^…).
Bar graphs show means. Error bars are ± s.d. Dots indicate each data point.
P values are calculated by two-sided Student’s t-test. d, Method of RNA
extraction following RGF1 treatment.




GO ID  Term  FDR
GO:0042221  response to chemical stimulus  5.10E-14
GO:0050896  response to stimulus  1.20E-12
GO:0042430  indole and derivative metabolic process  1.40E-12
GO:0005886  plasma membrane  1.90E-12
GO:0042434  indole derivative metabolic process  1.20E-11
GO:0051707  response to other organism  1.70E-11
GO:0010200  response to chitin  1.70E-11
GO:0051704  multi-organism process  7.50E-11
GO:0009617  response to bacterium  7.60E-11
GO:0009607  response to biotic stimulus  8.80E-11
GO:0006952  defense response  7.50E-10
GO:0006950  response to stress  2.80E-09
GO:0042435  indole derivative biosynthetic process  3.60E-09
GO:0042742  defense response to bacterium  8.00E-08
GO:0006790  sulfur metabolic process  1.00E-07
GO:0009743  response to carbohydrate stimulus  1.30E-07
GO:0009404  toxin metabolic process  1.70E-07
GO:0009407  toxin catabolic process  1.70E-07
GO:0010033  response to organic substance  2.50E-07
GO:0019748  secondary metabolic process  2.70E-07
GO:0005618  cell wall  3.10E-07
GO:0030312  external encapsulating structure  3.10E-07
GO:0004364  glutathione transferase activity  4.10E-06
GO:0003824  catalytic activity  4.10E-06
GO:0044272  sulfur compound biosynthetic process  5.90E-06
GO:0016137  glycoside metabolic process  5.90E-06
GO:0016020  membrane  9.50E-06
GO:0031224  intrinsic to membrane  9.90E-06
GO:0006725  cellular aromatic compound metabolic process  1.40E-05
GO:0009725  response to hormone stimulus  1.50E-05
GO:0009719  response to endogenous stimulus  6.20E-05
GO:0009636  response to toxin  6.20E-05
GO:0009505  plant-type cell wall  6.40E-05
GO:0030246  carbohydrate binding  7.10E-05
GO:0019760  glucosinolate metabolic process  0.00011
GO:0016143  S-glycoside metabolic process  0.00011
GO:0019757  glycosinolate metabolic process  0.00011
GO:0016021  integral to membrane  0.00011
GO:0023033  signaling pathway  0.00015
GO:0045087  innate immune response  0.00016
GO:0016740  transferase activity  0.00019
GO:0044425  membrane part  0.00021
GO:0010035  response to inorganic substance  0.00025
GO:0006955  immune response  0.00027
GO:0002376  immune system process  0.00027
GO:0009814  defense response, incompatible interaction  0.0003
GO:0031226  intrinsic to plasma membrane  0.00036
GO:0016491  oxidoreductase activity  0.00039
GO:0044271  cellular nitrogen compound biosynthetic process  0.00061
GO:0022804  active transmembrane transporter activity  0.00073
GO:0022857  transmembrane transporter activity  0.00073
GO:0016138  glycoside biosynthetic process  0.00076
GO:0023052  signaling  0.00078
GO:0022838  substrate-specific channel activity  0.0008
GO:0019438  aromatic compound biosynthetic process  0.00087
GO:0016765  transferase activity, transferring alkyl groups  0.00093
GO:0022892  substrate-specific transporter activity  0.00093
GO:0022891  substrate-specific transmembrane transporter activity  0.00093


Extended Data Fig. 2 | GO categories that are enriched upon RGF1 treatment.
These highly significantly enriched GO categories within lists of genes are
regulated by RGF1 (FDR-adjusted P < 0.001), and include glutathione
transferase activity (FDR-adjusted P = 4.1 × 10⁻⁶, red) and oxidoreductase
activity (FDR-adjusted P = 0.00039, red). See Supplementary Table 2 (enriched
GO categories upon RGF1 treatment). P values for GO enrichment analysis are
based on Fisher’s exact test, with the sample size being all genes in the genome
and using a Benjamini-Yekutieli FDR for multiple testing correction.
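The two statistical steps named in this legend can be illustrated with a short sketch. It is not the authors' pipeline: it assumes a one-sided (over-representation) form of Fisher's exact test, which the legend does not specify, and the function names and gene counts are hypothetical.

```python
from math import comb


def fisher_enrichment_p(k, n, K, N):
    """One-sided Fisher's exact P value for over-representation of a GO term.

    k: regulated genes annotated with the term
    n: regulated genes in total
    K: genes in the genome annotated with the term
    N: genes in the genome (the background, as in the legend)
    """
    total = comb(N, n)
    upper = min(n, K)
    # Probability of seeing k or more annotated genes among n drawn genes.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


def benjamini_yekutieli(pvalues):
    """Return Benjamini-Yekutieli adjusted P values in the input order."""
    m = len(pvalues)
    c_m = sum(1.0 / i for i in range(1, m + 1))  # harmonic correction factor
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0  # adjusted values are capped at 1
    for rank in range(m, 0, -1):  # walk from the largest P value down
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m * c_m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical toy numbers: 3 of 4 regulated genes carry a term that
# annotates 5 of 20 genes in the background.
p_term = fisher_enrichment_p(3, 4, 5, 20)
fdr = benjamini_yekutieli([p_term, 0.04, 0.03])
```

A category would then be reported when its adjusted value falls below the chosen cut-off (0.001 in the legend).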
Extended Data Fig. 3 | H₂O₂ levels after inducible overexpression of RITF1
and RGF1 treatment. a, Confocal images of H₂O₂-BES-Ac stained roots, with or
without XVE-RITF1 expression, in a wild-type (Col-0) background 4 h after
treatment with water (mock), 10 µM oestradiol or 100 nM RGF1.
b, Quantification of H₂O₂-BES-Ac intensity in the meristematic zone (n = 6
independent samples; *P < 0.0005). Bar graphs show means. Error bars are
±s.d. Dots indicate each data point. P values are calculated by two-sided
Student’s t-test.
Extended Data Fig. 4 | Localization of gPLT2-YFP, NBT and H₂O₂-BES-Ac
staining after RGF1 treatment. a-t, Localization of gPLT2-YFP (a-h), NBT
staining (i-n) and H₂O₂-BES-Ac staining (o-t), 4 h after treatment with water
(mock; a) or 100 nM RGF1 (b), 6 h after treatment with water (mock; c, i, o) or
100 nM RGF1 (d, j, p), 8 h after treatment with water (mock; e, k, q) or 100 nM
RGF1 (f, l, r), or 10 h after treatment with water (mock; g, m, s) or 100 nM RGF1
(h, n, t). Blue arrowheads indicate the junction between the meristematic and
elongation zones. Scale bar, 50 µm. Seedlings were grown on MS agar plates for
seven days before treatment. Experiments were independently repeated three
times with similar results.
Extended Data Fig. 5 | Time course of gPLT2-YFP localization and NBT and
H₂O₂-BES-Ac staining. a, Distance (in µm) of gPLT2-YFP localization from
quiescent-centre cells (n = 5 independent roots; P<5.7 x 10°). b, Total intensity
of NBT staining in the meristematic zone (n = 8 independent roots; P < 0.0003).
c, Average intensity of H₂O₂-BES-Ac staining in the elongation zone (n = 5
independent roots; P < 0.003). Bar graphs show means. Error bars are ±s.d.
Dots indicate each data point. P values are calculated by two-sided Student’s
t-test. Experiments were independently repeated three times with similar
results.
Extended Data Fig. 6 | Phenotype of ritf1 mutants. a, Root growth (in mm) of
wild-type (Col), ritf1-1 (CRISPR mutant) and ritf1-2 (SALK line) seedlings from 4
to 8 days after germination (n = 21 independent roots). b, Confocal images of
wild-type, ritf1-1 (CRISPR mutant) and ritf1-2 (SALK line) roots stained with PI.
c, Percentage increase (in which 100% is the number of cells in the mock-treated
case) in the number of cells in the meristematic zone of wild-type, ritf1-1 and
independent roots, *P<5.4 x10°°). d, Light microscope images of wild-type,
ritf1-3 and ritf1-2 roots stained with NBT 24 h after treatment with 5 nM
RGF1. Scale bars, 50 µm. Blue arrowheads show the junction between the
meristematic and elongation zones. e, Quantification of NBT staining intensity
in the meristematic zone in wild-type, ritf1-3 and ritf1-2 roots after treatment
with 5 nM RGF1 (n = 8 independent roots; *P < 0.003). Scale bars, 50 µm. Blue
and white arrowheads show the junction between the meristematic and
elongation zones. Bar and line graphs show means. Error bars are ±s.d. Dots
indicate each data point. P values are calculated by two-sided Student’s t-test.
Extended Data Fig. 7 | Expression of pPLT2-CFP and gPLT2-YFP upon RGF1
treatment. a, b, Confocal images showing pPLT2-CFP expression (cyan; a) and
gPLT2-YFP expression (green; b) 24 h after treatment with 20 nM RGF1. Red, PI
staining. Scale bar, 50 µm. Arrowheads show the junction between the
meristematic and elongation zones. c, Extent (in µm) of gPLT2-YFP expression
from quiescent-centre cells (n = 5 independent roots; P<2.5 10). Bar graphs
show means. Error bars are ±s.d. Dots indicate each data point. P values are
calculated by two-sided Student's t-test.




[Figure panels: mock, 20 nM RGF1, 20 nM RGF1 with 500 µM H₂O₂, and 500 µM H₂O₂; bar graph for the same four treatments, y axis from 0 to 3,500.]
treatment. a-d, Confocal images showing gPLT2-YFP expression 24 h after
treatment with water (mock), 20 nM RGF1, 20 nM RGF1 with 500 µM H₂O₂, or

quiescent-centre cells (n = 6, *P < 0.0002). Bar graphs show means. Error bars
are ±s.d. Dots indicate each data point. P values calculated by two-sided
Extended Data Fig. 9 | Phenotypes resulting from RITF1 overexpression in
plt2 mutants. a, NBT-stained roots, with or without XVE-RITF1 expression, in a
wild-type or plt2 background 24 h after treatment with water (mock) or 10 µM
oestradiol. b, Quantification of NBT staining intensity in the differentiation
zone with or without XVE-RITF1 in wild-type and plt2 roots (n = 8 independent
roots; *P<5.4x10°°). c, Confocal images of PI-stained roots with or without
XVE-RITF1, in a wild-type or plt2 background, 24 h after mock treatment or
treatment with 10 µM oestradiol. d, Number of cells in the meristematic zone,
with or without XVE-RITF1, in wild-type and plt2 roots 24 h after mock or 10 µM
oestradiol treatment (n = 7 independent roots; *P<4.3 x10~). Scale bars,
50 µm. White and blue arrowheads indicate the junction between the
meristematic and elongation zones. Bar graphs show means. Error bars are
±s.d. Dots indicate each data point. P values are calculated by two-sided
Student’s t-test.
Extended Data Fig. 10 | Phenotype of plt2 roots upon RGF1 treatment.
a, Confocal images of PI-stained wild-type and plt2 roots 24 h after treatment
with water (mock) or 20 nM RGF1. Scale bar, 50 µm. White arrowheads show
junctions between the meristematic and elongation zones. b, Number of cells
in the meristematic zone 24 h after mock or 5 nM RGF1 treatment (n = 6
independent roots; *P<4.9 x10”). c, Light microscope images of roots from
wild-type and plt2 roots stained with NBT. Seedlings were grown on MS agar
plates for 7 days before treatment with water (mock) or 20 nM RGF1. d, Total
intensity of NBT staining in the differentiation zone of wild-type and plt2 roots
24 h after treatment with water (mock) or 20 nM RGF1 (n = 8 independent roots;
*P < 0.0003). Scale bars, 50 µm. White arrowheads show the junction between
the meristematic and elongation zones. Bar graphs show means. Error bars are
±s.d. Dots indicate each data point. P values are calculated by two-sided
Student’s t-test.
Life sciences study design
Sample size: We used 3 biological replicates for RNA-seq, following the common practice in the field.
Data exclusions: No data were excluded.
Replication: We calculated gene-wise dispersion among biological replicates for RNA-seq data. The dispersion plot displayed the typical pattern for RNA-seq data.
Randomization: All samples for RNA-seq and for measuring ROS and meristem size were randomly selected.
Blinding: For RNA-seq analysis, one investigator randomly collected samples and generated RNA-seq libraries. Another investigator did the computational
analysis. Our transcriptome analysis was completely blind.
Screening mammography aims to identify breast cancer at earlier stages of the
disease, when treatment can be more successful. Despite the existence of screening
programmes worldwide, the interpretation of mammograms is affected by high rates
of false positives and false negatives. Here we present an artificial intelligence (AI)
system that is capable of surpassing human experts in breast cancer prediction. To
assess its performance in the clinical setting, we curated a large representative dataset
from the UK and a large enriched dataset from the USA. We show an absolute
reduction of 5.7% and 1.2% (USA and UK) in false positives and 9.4% and 2.7% in false
negatives. We provide evidence of the ability of the system to generalize from the UK
to the USA. In an independent study of six radiologists, the AI system outperformed
all of the human readers: the area under the receiver operating characteristic curve
(AUC-ROC) for the AI system was greater than the AUC-ROC for the average
radiologist by an absolute margin of 11.5%. We ran a simulation in which the AI system
participated in the double-reading process that is used in the UK, and found that the
AI system maintained non-inferior performance and reduced the workload of the
second reader by 88%. This robust assessment of the AI system paves the way for
clinical trials to improve the accuracy and efficiency of breast cancer screening.


Breast cancer is the second leading cause of death from cancer in
women, but early detection and treatment can considerably improve
outcomes. As a consequence, many developed nations have implemented
large-scale mammography screening programmes. Major
medical and governmental organizations recommend screening for
all women starting between the ages of 40 and 50. In the USA and UK
combined, over 42 million exams are performed each year.

Despite the widespread adoption of mammography, interpretation
of these images remains challenging. The accuracy achieved by experts
in cancer detection varies widely, and the performance of even the
best clinicians leaves room for improvement. False positives
can lead to patient anxiety, unnecessary follow-up and invasive
diagnostic procedures. Cancers that are missed at screening may
not be identified until they are more advanced and less amenable to
treatment.

AI may be uniquely poised to help with this challenge. Studies
have demonstrated the ability of AI to meet or exceed the performance
of human experts on several tasks of medical-image analysis.
As a shortage of mammography professionals threatens the availability
and adequacy of breast-screening services around the world, the
scalability of AI could improve access to high-quality care for all.

Computer-aided detection (CAD) software for mammography was
introduced in the 1990s, and several assistive tools have been approved
for medical use. Despite early promise, this generation of software
failed to improve the performance of readers in real-world settings.
More recently, the field has seen a renaissance owing to the success
of deep learning. A few studies have characterized systems for breast
cancer prediction with stand-alone performance that approaches that
of human experts. However, the existing work has several limitations.
Most studies are based on small, enriched datasets with limited
follow-up, and few have compared performance to readers in actual
clinical practice, instead relying on laboratory-based simulations of the
reading environment. So far there has been little evidence of the ability
of AI systems to translate between different screening populations
and settings without additional training data. Critically, the pervasive
use of follow-up intervals that are no longer than 12 months
Test set  UK  US
Interpretation  Double reading  Single reading
Screening interval  3 years  1 or 2 years
Cancer follow-up  39 months  27 months
Number of cancers  414 (1.6%)  686 (22.2%)
Fig. 1 | Development of an AI system to detect cancer in screening
mammograms. Datasets representative of the UK and US breast cancer
screening populations were curated from three screening centres in the UK and
one centre in the USA. Outcomes were derived from the biopsy record and
longitudinal follow-up. An AI system was trained to identify the presence of
breast cancer from a set of screening mammograms, and was evaluated in three


means that more subtle cancers that are not identified until the next 
screen may be ignored. 

In this study, we evaluate the performance of a new AI system for
breast cancer prediction using two large, clinically representative 
datasets from the UK and the USA. We compare the predictions of 
the system to those made by readers in routine clinical practice and 
show that performance exceeds that of individual radiologists. These 
observations are confirmed with an independently conducted reader 
study. Furthermore, we show how this system might be integrated 
into screening workflows, and provide evidence that the system can 
generalize across continents. Figure 1 shows an overview of the project. 


Datasets from cancer screening programmes 


A deep learning model for identifying breast cancer in screening mammograms
was developed and evaluated using two large datasets from
the UK and the USA. We report results on test sets that were not used
to train or tune the AI system.

The UK test set consisted of screening mammograms that were collected
between 2012 and 2015 from 25,856 women at two screening
centres in England, where women are screened every three years. It
included 785 women who had a biopsy, and 414 women with cancer that
was diagnosed within 39 months of imaging. This was a random sample
of 10% of all women with screening mammograms at these sites during
this time period. The UK cohort resembled the broader screening
population in age and disease characteristics (Extended Data Table 1a).

The test set from the USA, where women are screened every one to 
two years, consisted of screening mammograms that were collected 
between 2001 and 2018 from 3,097 women at one academic medical 
centre. We included images from all 1,511 women who were biopsied 
during this time period and a random subset of women who never 
underwent biopsy (Methods). Among the women who received a 
biopsy, 686 were diagnosed with cancer within 27 months of imaging. 

Breast cancer outcome was determined on the basis of multiple years
of follow-up (Fig. 1). We chose the follow-up duration on the basis of
the screening interval in the country of origin for each dataset. In a
similar manner to previous work, we augmented each interval with a
[Fig. 1 panel labels. Across datasets: trained on UK training set, tested on US test set. Ground-truth determination: positive if biopsy-confirmed within T + 3 months; otherwise, negative if a second exam occurred after T − Δ. Independently conducted reader study: 6 radiologists read 500 cases from US test set.]

primary ways: first, AI predictions were compared with the historical decisions
made in clinical practice; second, to evaluate the generalizability across
populations, a version of the AI system was developed using only the UK data
and retested on the US data; and finally, the performance of the AI system was
compared to that of six independent radiologists using a subset of the US
test set.


three-month buffer to account for variability in scheduling and latency 
of follow-up. Cases that were designated as cancer-positive were accompanied
by a biopsy-confirmed diagnosis within the follow-up period.
Cases labelled as cancer-negative had at least one follow-up non-cancer 
screen; cases without this follow-up were excluded from the test set. 
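The labelling rule described above can be written down compactly. The sketch below is an illustration under simplifying assumptions, not the study's code: the function and argument names are hypothetical, time is measured in whole months, and the follow-up windows are taken as the 39 and 27 months stated in the text (screening interval plus the three-month buffer).

```python
from typing import Optional

# Follow-up windows in months: screening interval plus a 3-month buffer.
# The US value assumes the upper end of the one-to-two-year interval.
FOLLOW_UP_MONTHS = {"UK": 36 + 3, "US": 24 + 3}


def label_case(
    country: str,
    months_to_cancer_biopsy: Optional[float],
    has_followup_noncancer_screen: bool,
) -> Optional[str]:
    """Return 'positive', 'negative', or None when the case is excluded."""
    window = FOLLOW_UP_MONTHS[country]
    # Positive: biopsy-confirmed cancer inside the follow-up window.
    if months_to_cancer_biopsy is not None and months_to_cancer_biopsy <= window:
        return "positive"
    # Negative: at least one later screen that did not find cancer.
    if has_followup_noncancer_screen:
        return "negative"
    # No confirming follow-up: dropped from the test set.
    return None


labels = [
    label_case("UK", 38, False),
    label_case("US", None, True),
    label_case("US", None, False),
]
```

Returning `None` for unconfirmed cases keeps exclusion explicit, so a caller cannot silently count them as negatives.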


Retrospective clinical comparison 


We used biopsy-confirmed breast cancer outcomes to evaluate the
predictions of the AI system as well as the original decisions made by
readers in the course of clinical practice. Human performance was
computed on the basis of the clinician’s decision to recall the patient for
further diagnostic investigation. The receiver operating characteristic
(ROC) curve of the AI system is shown in Fig. 2.

In the UK, each mammogram is interpreted by two readers, and 
in cases of disagreement, an arbitration process may invoke a third 
opinion. These interpretations occur serially, such that each reader 
has access to the opinions of previous readers. The records of these 
decisions yield three benchmarks of human performance for cancer 
prediction. 

Compared to the first reader, the AI system demonstrated a statistically
significant improvement in absolute specificity of 1.2% (95%
confidence interval (CI) 0.29%, 2.1%; P = 0.0096 for superiority) and
an improvement in absolute sensitivity of 2.7% (95% CI −3%, 8.5%;
P = 0.004 for non-inferiority at a pre-specified 5% margin; Extended Data
Table 2a).

Compared to the second reader, the AI system showed non-inferiority
(at a 5% margin) for both specificity (P < 0.001) and sensitivity
(P = 0.02). Likewise, the AI system showed non-inferiority (at a 5% margin)
to the consensus judgment for specificity (P < 0.001) and sensitivity
(P = 0.0039).
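One way to read the superiority and non-inferiority claims is through the confidence interval of the difference (AI minus reader). The sketch below uses that interval-based reading as an assumption; the P values in the text come from the authors' own tests, which are not reproduced here, and the function name is hypothetical.

```python
def classify_difference(ci_lower, ci_upper, margin=5.0):
    """Classify an AI-minus-reader difference (percentage points) from its CI.

    'superior'      : the whole interval lies above 0
    'non-inferior'  : the whole interval lies above -margin
    'inconclusive'  : the interval reaches -margin or below
    """
    if ci_lower > ci_upper:
        raise ValueError("ci_lower must not exceed ci_upper")
    if ci_lower > 0:
        return "superior"
    if ci_lower > -margin:
        return "non-inferior"
    return "inconclusive"


# Intervals quoted in the text for the UK first-reader comparison.
specificity_result = classify_difference(0.29, 2.1)
sensitivity_result = classify_difference(-3.0, 8.5)
```

With the quoted intervals, the specificity gain clears zero while the sensitivity interval only clears the 5% margin, which matches the wording used for the first-reader comparison.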

In the standard screening protocol in the USA, each mammogram is
interpreted by a single radiologist. We used the BI-RADS® score that was
assigned to each case in the original screening context as a proxy for
human cancer prediction (see Methods section ‘Interpreting clinical
reads’). Compared to the typical reader, the AI system demonstrated
statistically significant improvements in absolute specificity of 5.7%
Fig. 2 | Performance of the AI system and clinical readers in breast cancer
prediction. a, The ROC curve of the AI system on the UK screening data. The AUC
is 0.889 (95% CI 0.871, 0.907; n = 25,856 patients). Also shown are the sensitivity
and specificity pairs for the human decisions made in clinical practice. Cases
were considered positive if they received a biopsy-confirmed diagnosis of cancer
within 39 months of screening. The consensus decision represents the standard
of care in the UK, and will involve input from between two and three expert
readers. The inset shows a magnification of the grey shaded region. AI system
operating points were selected on a separate validation dataset: point i was
intended to match the sensitivity and exceed the specificity of the first reader;
points ii and iii were selected to attain non-inferiority for both the sensitivity and
specificity of the second reader and consensus opinion, respectively. b, The ROC


(95% CI 2.6%, 8.6%; P < 0.001) and in absolute sensitivity of 9.4% (95%
CI 4.5%, 13.9%; P < 0.001; Extended Data Table 2a).


Generalization across populations 

To evaluate the ability of the AI system to generalize across populations
and screening settings, we trained the same architecture using only
the UK dataset and applied it to the US test set (Fig. 2b). Even without
exposure to the US training data, the ROC curve of the AI system
encompasses the point that indicates the average performance of US
radiologists. Again, the AI system showed improved specificity (+3.5%,
P = 0.0212) and sensitivity (+8.1%, P = 0.0006; Extended Data Table 2b)
compared with radiologists.


Comparison with a reader study 


In a reader study that was conducted by an external clinical research 
organization, six US-board-certified radiologists who were compliant 
with the requirements of the Mammography Quality Standards Act 
(MQSA) interpreted 500 mammograms that were randomly sampled 
from the US test set. Where data were available, readers were equipped 
with contextual information typically available in the clinical setting, 
including the patient’s age, breast cancer history, and previous
screening mammograms.

Among the 500 cases selected for this study, 125 had biopsy-proven 
cancer within 27 months, 125 had a negative biopsy within 27 months 
and 250 were not biopsied (Extended Data Table 3). These proportions 
were chosen to increase the difficulty of the screening task and increase 
statistical power. (Such enrichment is typical in observer studies.)

Readers rated each case using the forced BI-RADS scale, and BI-RADS
scores were compared to ground-truth outcomes to fit an ROC
curve for each reader. The scores of the AI system were treated in the
same manner (Fig. 3).
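To make the comparison of ordinal ratings with outcomes concrete, the sketch below computes an empirical (non-parametric) AUC from scores and ground-truth labels. This is a swapped-in simplification: the study fits ROC curves, whereas this example uses the rank-based estimate, and the ratings shown are invented for illustration.

```python
def empirical_auc(scores, labels):
    """Non-parametric AUC: the chance that a cancer case outranks a non-cancer case.

    scores: ordinal ratings (for example BI-RADS categories) or model scores
    labels: 1 for cancer, 0 for no cancer
    Ties between a positive and a negative count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical ratings for six cases, four of them cancers.
reader_auc = empirical_auc([1, 2, 3, 4, 5, 2], [0, 0, 1, 1, 1, 1])
```

Because ordinal scales produce many ties, counting a tie as half a win keeps the estimate equal to the area under the trapezoidal ROC curve.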
curve of the AI system on the US screening data. When trained on both datasets
(solid curve), the AUC is 0.8107 (95% CI 0.791, 0.831; n = 3,097 patients). When
trained on only the UK dataset (dotted curve), the AUC is 0.757 (95% CI 0.732,
0.780). Also shown are the sensitivity and specificity achieved by radiologists in
clinical practice using BI-RADS. Cases were considered positive if they received
a biopsy-confirmed diagnosis of cancer within 27 months of screening. AI system
operating points were chosen, using a separate validation dataset, to exceed the
sensitivity and specificity of the average reader. Negative cases were upweighted
to account for the sampling protocol (see Methods section ‘Inverse probability
weighting’). Extended Data Figure 1 shows an unweighted analysis. See Extended
Data Table 2a for statistical comparisons of sensitivity and specificity.


The AI system exceeded the average performance of radiologists
by a significant margin (change in area under curve (ΔAUC) = +0.115,
95% CI 0.055, 0.175; P = 0.0002). Similar results were observed when
a follow-up period of one year was used instead of 27 months (Fig. 3c,
Extended Data Fig. 2).

In addition to producing a classification decision for the entire case, 
the AI system was designed to highlight specific areas of suspicion for
malignancy. Likewise, the readers in our study supplied rectangular 
region-of-interest (ROI) annotations surrounding concerning findings. 

We used multi-localization receiver operating characteristic
(mLROC) analysis to compare the ability of the readers and the AI
system to identify malignant lesions within each case (see Methods
section ‘Localization analysis’).

We summarized each mLROC plot by computing the partial area
under the curve (pAUC) in the false-positive fraction interval from 0
to 0.1 (Extended Data Fig. 3). The AI system exceeded human performance
by a significant margin (ΔpAUC = +0.0192, 95% CI 0.0086,
0.0298; P = 0.0004).


Potential clinical applications 


The classifications made by the AI system could be used to reduce the
workload involved in the double-reading process that is used in the 
UK, while preserving the standard of care. We simulated this scenario 
by omitting the second reader and any ensuing arbitration when the 
decision of the AI system agreed with that of the first reader. In these
cases, the opinion of the first reader was treated as final. In cases of 
disagreement, the second and consensus opinions were invoked as 
usual. This combination of human and machine results in performance 
equivalent to that of the traditional double-reading process, but saves 
88% of the effort of the second reader (Extended Data Table 4a). 
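The simulated workflow described above can be sketched as follows. This is an illustration rather than the study's code: the record layout (`first`, `ai`, `consensus` as boolean recall decisions) and the function name are hypothetical, and the toy cases are invented.

```python
def simulate_ai_second_reader(cases):
    """Replace the second reader with the AI system whenever it agrees with reader 1.

    cases: list of dicts with boolean recall decisions under the keys
           'first' (first reader), 'ai' (AI system) and 'consensus'
           (outcome of the usual double-reading process).
    Returns the final decisions and the fraction of second reads avoided.
    """
    final_decisions = []
    second_reads_needed = 0
    for case in cases:
        if case["ai"] == case["first"]:
            # Agreement: the first reader's opinion is treated as final.
            final_decisions.append(case["first"])
        else:
            # Disagreement: fall back to second reader and arbitration.
            second_reads_needed += 1
            final_decisions.append(case["consensus"])
    workload_saved = 1 - second_reads_needed / len(cases)
    return final_decisions, workload_saved


toy_cases = [
    {"first": True, "ai": True, "consensus": True},
    {"first": False, "ai": False, "consensus": False},
    {"first": False, "ai": True, "consensus": True},
    {"first": False, "ai": False, "consensus": True},
]
decisions, saved = simulate_ai_second_reader(toy_cases)
```

The last toy case shows the trade-off the simulation has to measure: when the AI system and the first reader agree on a wrong call, the consensus process that might have corrected it is never invoked.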

The AI system could also be used to provide automated, immediate
feedback in the screening setting. 
Fig. 3 | Performance of the AI system in breast cancer prediction compared to
six independent readers. a, Six readers rated each case (n = 465) using the
six-point BI-RADS scale. A fitted ROC curve for each of the readers is compared to
the ROC curve of the AI system (see Methods section ‘Statistical analysis’). For
reference, a non-parametric ROC curve is presented in tandem. Cases were
considered positive (n = 113) if they received a pathology-confirmed diagnosis of
cancer within 27 months of the time of screening. Note that this sample of cases
was enriched for patients who received a negative biopsy result (n = 119), making
this a more-challenging population for screening. The mean reader AUC was
0.625 (s.d. 0.032), whereas the AUC for the AI system was 0.740 (95% CI 0.696,
0.794). The AI system exceeded human performance by a significant margin
(ΔAUC = +0.115, 95% CI 0.055, 0.175; P = 0.0002 by two-sided ORH method
(see Methods section ‘Statistical analysis’)). For results using a 12-month interval,
see Extended Data Fig. 2. b, Pooled results from all six readers from a. c, Pooled
results (n = 408) from all 6 readers using a 12-month interval for cancer
definition. Cases were considered positive (n = 56) if they received a pathology-
confirmed cancer diagnosis within one year (Extended Data Table 3).


To identify normal cases with high confidence, we used a very-low
decision threshold. For the UK data, we achieved a negative predictive
value (NPV) of 99.99% while retaining a specificity of 41.15%. Similarly,
for the US data, we achieved a NPV of 99.90% while retaining a specificity
of 34.79%. These data suggest that it may be feasible to dismiss 35-41%
of normal cases if we allow for one cancer in every 1,000-10,000 negative
predictions (NPV 99.90-99.99% in USA-UK). By comparison, consensus
double reading in our UK dataset included one cancer in every
182 cases that were deemed normal.

To identify cancer cases with high confidence, we used a very-high 
decision threshold. For the UK data, we achieved a positive predictive 
value (PPV) of 85.6% while retaining a sensitivity of 41.2%. Similarly, for 
the US data, we achieved a PPV of 82.4% while retaining a sensitivity of 
29.8%. These data suggest that it may be feasible to rapidly prioritize 
30-40% of cancer cases, with approximately five out of six follow- 
ups leading to a diagnosis of cancer. By comparison, in our study only 
22.8% of UK cases that were recalled by consensus double reading and 
4.9% of US cases that were recalled by single reading were ultimately 
diagnosed with cancer. 
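The arithmetic behind the "one cancer in every 1,000-10,000" and "five out of six" statements can be checked directly from the quoted predictive values. The helper names below are hypothetical and the sketch only restates numbers given in the text.

```python
def negatives_per_missed_cancer(npv_percent):
    """Number of negative predictions per missed cancer implied by an NPV."""
    return 100.0 / (100.0 - npv_percent)


def npv_from_one_in(n_cases):
    """NPV (in percent) implied by one cancer in every n_cases negative calls."""
    return 100.0 * (1.0 - 1.0 / n_cases)


us_rate = negatives_per_missed_cancer(99.90)   # about one cancer per 1,000
uk_rate = negatives_per_missed_cancer(99.99)   # about one cancer per 10,000
consensus_npv = npv_from_one_in(182)           # consensus double reading, UK

# PPVs of 85.6% (UK) and 82.4% (US) bracket five out of six (about 83.3%).
five_in_six = 100.0 * 5 / 6
ppv_gap_uk = 85.6 - five_in_six
ppv_gap_us = 82.4 - five_in_six
```

Read this way, the one-in-182 figure for consensus double reading corresponds to an NPV of roughly 99.45%, which is the baseline against which the high-confidence thresholds are being compared.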


Performance breakdown 


Comparing the errors of the AI system with errors from clinical reads
revealed many cases in which the AI system correctly identified
cancer whereas the reader did not, and vice versa (Supplementary
Table 1). Most of the cases in which only the AI system identified cancer
were invasive (Extended Data Table 5). On the other hand, cases in
which only the reader identified cancer were split more evenly between
in situ and invasive. Further breakdowns by invasive cancer size,
grade and molecular markers show no clear biases (Supplementary
Table 2).

We also considered the disagreement between the AI system and
the six radiologists that participated in the US reader study. Figure 4a
shows a sample cancer case that was missed by all six radiologists,
but correctly identified by the AI system. Figure 4b shows a sample
cancer case that was caught by all six radiologists, but missed by the AI
system. Although we were unable to determine clear patterns among
these instances, the presence of such edge cases suggests potentially
complementary roles for the AI system and human readers in reaching
accurate conclusions.

We compared the performance of the 20 individual readers best
represented in the UK clinical dataset with that of the AI system
(Supplementary Table 3). The results of this analysis suggest that the aggregate
comparison presented above is not unduly influenced by any particular
readers. Breakdowns by cancer type, grade and lesion size suggest no
apparent difference in the distribution of cancers detected by the AI
system and human readers (Extended Data Table 6a).

On the US test set, a breakdown by cancer type (Extended Data 
Table 6b) shows that the sensitivity advantage of the AI system is
concentrated on the identification of invasive cancers (for example, 
invasive lobular or ductal carcinoma) rather than in situ cancer (for 
example, ductal carcinoma in situ). A breakdown by BI-RADS® breast 
density category shows that performance gains apply equally across 
the spectrum of breast tissue types that is represented in this dataset 
(Extended Data Table 6c). 


Discussion 


In this study we present an AI system that outperforms radiologists on a
clinically relevant task of breast cancer identification. These results held
across two large datasets that are representative of different screening
populations and practices.

In the UK, the AI system showed specificity superior to that of the
first reader. Sensitivity at the same operating point was non-inferior.
Consensus double reading has been shown to improve performance
compared to single reading, and represents the current standard
of care in the UK and many European countries. Our system did not
outperform this benchmark, but was statistically non-inferior to the
second reader and consensus opinion.


Fig. 4 | Discrepancies between the AI system and human readers. a, A sample
cancer case that was missed by all six readers in the US reader study, but
correctly identified by the AI system. The malignancy, outlined in yellow, is a
small, irregular mass with associated microcalcifications in the lower inner
right breast. b, A sample cancer case that was caught by all six readers in the US
reader study, but missed by the AI system. The malignancy is a dense mass in
the lower inner right breast. Left, mediolateral oblique view; right,
craniocaudal view.


In the USA, the AI system exhibited specificity and sensitivity superior
to that of radiologists practising in an academic medical centre. This
trend was confirmed in an externally conducted reader study, which
showed that the scores of the AI system stratified cases better than the
BI-RADS ratings (the standard scale for mammography assessment in
the USA) that were assigned by each of the six readers.

Notably, the human readers (both in the clinic and our reader study) 
had access to patient history and previous mammograms when making 
screening decisions. The US clinical readers may have also had access to 
breast tomosynthesis images. By contrast, the AI system only processed
the most recent mammogram. 

These comparisons are not without limitations. Although the UK 
dataset mirrored the nationwide screening population in age and can- 
cer prevalence (Extended Data Table 1a), the same cannot be said of 
the US dataset, which was drawn from a single screening centre and 
enriched for cancer cases. 


By chance, the vast majority of images used in this study were 
acquired on devices made by Hologic. Future research should assess 
the performance of the AI system across a variety of manufacturers in
a more systematic way.

In our reader study, all of the radiologists were eligible to interpret 
screening mammograms in the USA, but did not uniformly receive 
fellowship training in breast imaging. It is possible that a higher bench- 
mark for performance could have been obtained with readers who 
were more specialized“. 

To obtain high-quality ground-truth labels, we used extended follow- 
up intervals that were chosen to encompass a subsequent round of 
screening in each country. Although there is some precedent in clini- 
cal trials** and targeted cohort studies”, this step is not usually taken 
during systematic evaluation of AI systems for breast cancer detection.

In retrospective datasets with shorter follow-up intervals, outcome 
labels tend to be skewed in favour of readers. As they are gatekeepers 
for biopsy, asymptomatic cases will only receive a cancer diagnosis 
if a mammogram raises the suspicions of a reader. A longer follow-
up interval decouples the ground-truth labels from reader opinions 
(Extended Data Fig. 4) and includes cancers that may have been initially 
missed by human eyes. 

The use of an extended interval makes cancer prediction a more 
challenging task. Cancers that are diagnosed years later may include 
new growths for which there could be no mammographic evidence in 
the original images. Consequently, the sensitivity values presented 
here are lower than what has been reported for 12-month intervals” 
(Extended Data Fig. 5). 

We present early evidence of the ability of the AI system to generalize
across populations and screening protocols. We retrained the system 
using exclusively UK data, and then measured performance on unseen 
US data. In this context, the system continued to outperform radiologists,
albeit by a smaller margin. This suggests that in future clinical
deployments, the system might offer strong baseline performance, 
but could benefit from fine-tuning with local data. 

The optimal use of the AI system within clinical workflows remains
to be determined. The specificity advantage exhibited by the system
suggests that it could help to reduce recall rates and unnecessary biopsies.
The improvement in sensitivity exhibited in the US data shows
that the AI system may be capable of detecting cancers earlier than the
standard of care. An analysis of the localization performance of the AI
system suggests it holds early promise for flagging suspicious regions
for review by experts. Notably, the additional cancers identified by the
AI system tended to be invasive rather than in situ disease.

Beyond improving reader performance, the technology described 
here may have a number of other clinical applications. Through simulation,
we suggest how the system could obviate the need for double
reading in 88% of UK screening cases, while maintaining a similar level 
of accuracy to the standard protocol. We also explore how high-confi- 
dence operating points can be used to triage high-risk cases and dismiss 
low-risk cases. These analyses highlight the potential of this technology 
to deliver screening results in a sustainable manner despite workforce 
shortages in countries such as the UK®. Prospective clinical studies will 
be required to understand the full extent to which this technology can 
benefit patient care. 
Methods


Ethical approval 

Use of the UK dataset for research collaborations by both commercial 
and non-commercial organizations received ethical approval (REC ref- 
erence 14/SC/0258). The US data were fully de-identified and released 
only after an Institutional Review Board approval (STU00206925). 


The UK dataset 

The UK dataset was collected from three breast screening sites in the 
UK National Health Service Breast Screening Programme (NHSBSP). 
The NHSBSP invites women aged between 50 and 70 who are regis- 
tered with a general practitioner (GP) for mammographic screening 
every three years. Women who are not registered with a GP, or who are
older than 70, can self-refer to the screening programme. In the UK, 
the screening programme uses double reading: each mammogram 
is read by two radiologists, who are asked to decide whether to recall 
the woman for additional follow-up. When there is disagreement, an 
arbitration process takes place. 

The data were initially compiled by OPTIMAM (Cancer Research UK) 
between 2010 and 2018, from St George’s Hospital (London), Jarvis 
Breast Centre (Guildford) and Addenbrooke’s Hospital (Cambridge). 
The collected data included screening and follow-up mammograms 
(comprising mediolateral oblique and craniocaudal views of the left and 
right breasts), all radiologist opinions (including the arbitration result, 
if applicable) and the metadata associated with follow-up treatment. 

The mammograms and associated metadata of 137,291 women were 
considered for inclusion in the study. Of these, 123,964 women had 
screening images and uncorrupted metadata. Exams that were recalled 
for reasons other than radiographic evidence of malignancy, or epi- 
sodes that were not part of routine screening, were excluded. In total, 
121,850 women had at least one eligible exam. Women who were below 
the age of 47 at the time of the screen were excluded from validation 
and test sets, leaving 121,455 women. Finally, women for whom there 
was no exam with sufficient follow-up were excluded from validation 
and test sets. This last step resulted in the exclusion of 5,990 of 31,766 
test-set cases (19%); see Supplementary Fig. 1. 

The test set is a random sample of 10% of all women who were 
screened at two sites (St George’s Hospital and Jarvis Breast Centre) 
between 2012 and 2015. Insufficient data were provided to apply the 
sampling procedure to the third site. In assembling the test set, we 
randomly selected a single eligible screening mammogram from the 
record of each woman. For women with a positive biopsy, eligible mammograms
were those conducted in the 39 months before the date of
biopsy. For women who never had a positive biopsy, eligible mammograms
were accompanied by a non-suspicious mammogram at least
21 months later. 
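
As an illustration only, the two eligibility windows described above can be written as a small predicate. This is a sketch rather than the study's code: the function names are hypothetical, and converting days to months with an average month length is an assumption, because the text does not say how month boundaries were computed.

```python
from datetime import date

DAYS_PER_MONTH = 365.25 / 12  # assumption: average month length


def months_apart(earlier: date, later: date) -> float:
    """Approximate number of months from `earlier` to `later`."""
    return (later - earlier).days / DAYS_PER_MONTH


def is_eligible(screen, positive_biopsy=None, later_normal_screens=()):
    """Apply the UK test-set windows to one screening mammogram."""
    if positive_biopsy is not None:
        # Positive: screen taken in the 39 months before the biopsy.
        return 0 <= months_apart(screen, positive_biopsy) <= 39
    # Negative: a non-suspicious mammogram at least 21 months later.
    return any(months_apart(screen, d) >= 21 for d in later_normal_screens)
```

Under these assumptions, a screen taken two years before a positive biopsy is eligible, whereas a negative screen followed only by a normal mammogram one year later is not.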

The final test set consisted of 25,856 women (see Supplementary 
Fig. 1). When compared to the UK national breast cancer screening 
service, we observed a very similar distribution of cancer prevalence,
age and cancer type (see Extended Data Table 1a). Digital mammograms
were acquired predominantly on devices manufactured by
Hologic (95%), followed by General Electric (4%) and Siemens (1%). 


The US dataset 

The US dataset was collected from Northwestern Memorial Hospital 
(Chicago) between 2001 and 2018. In the USA, each screening mammogram
is typically read by a single radiologist, and screens are conducted
annually or biannually. The breast radiologists at this hospital receive 
fellowship training and only interpret breast-imaging studies. Their 
experience levels ranged from 1 to 30 years. The American College of 
Radiology (ACR) recommends that women start routine screening at 
the age of 40; other organizations, including the United States Preven- 
tive Services Task Force (USPSTF), recommend that screening begins 
at the age of 50 for women with an average risk of breast cancer® ®. 


The US dataset included records from all women that underwent a 
breast biopsy between 2001 and 2018. It also included a random sam- 
ple of approximately 5% of all women who participated in screening, 
but were never biopsied. This heuristic was used in order to capture 
all cancer cases (to enhance statistical power) and to curate a rich set 
of benign findings on which to train and test the AI system. The data-processing
steps involved in constructing the dataset are summarized
in Supplementary Fig. 2. 

Among women with a completed mammogram order, we collected
records from all women with a pathology report that contained the 
term ‘breast’. Among women that lacked such a pathology report, 
those whose records bore an International Classification of Diseases 
(ICD) code indicative of breast cancer were excluded. Approximately 
5% of this unbiopsied negative population was sampled. After de- 
identification and transfer, women were excluded if their metadata 
were unavailable or corrupted. The women in the dataset were split 
randomly among train (55%), validation (15%) and test (30%) sets. For 
testing, a single case was chosen for each woman, following a similar 
procedure as for the UK dataset. In women who underwent biopsy, 
we randomly chose a case from the 27 months preceding the date of 
biopsy. For women who did not undergo biopsy, one screening mam- 
mogram was randomly chosen from among those with a follow-up 
event at least 21 months later. 

Cases were considered complete if they possessed the four standard 
screening views (mediolateral oblique and craniocaudal views of the 
left and right breasts), acquired for screening intent. Again, the vast 
majority of the studies were acquired using Hologic (including Lorad- 
branded) devices (99%); the other manufacturers (Siemens and General 
Electric) together constituted less than 1% of studies. 

The radiology reports associated with cases in the test set were used 
to flag and exclude cases that involved breast implants or were recalled 
for technical reasons. To compare the AI system against the clinical
reads performed at this site, we employed clinicians to manually extract 
BI-RADS scores from the original radiology reports. There were some 
cases for which the original radiology report could not be located, 
even if a subsequent cancer diagnosis was confirmed by biopsy. This 
might have happened, for example, if the screening case was imported
from an outside institution. Such cases were excluded from the clinical 
reader comparison. 


Randomization and blinding 

Patients were randomized into training, validation, and test sets by 
applying a hash function to the de-identified medical record number. 
Set assignment was based on the value of the resulting integer modulo 
100. For the UK data, values of 0-9 were reserved for the test set. For the
US data, values of 0-29 were reserved for the test set. Test set sizes were 
chosen to produce, in expectation, a sufficient number of positives to 
power statistical comparisons on the metric of sensitivity. 
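
The assignment rule can be sketched as follows. This is a minimal illustration, not the study's implementation: the text states only that a hash of the de-identified record number was reduced modulo 100, with 0-9 (UK) and 0-29 (US) reserved for testing. The choice of SHA-256 and the names `residue` and `in_test_set` are assumptions.

```python
import hashlib

# Test-set residues stated in the text; all other names are illustrative.
TEST_RESIDUES = {"UK": range(0, 10), "US": range(0, 30)}


def residue(record_id: str) -> int:
    """Hash a de-identified record number to a stable integer in [0, 100)."""
    digest = hashlib.sha256(record_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def in_test_set(record_id: str, dataset: str) -> bool:
    """True if the patient falls in the residue range reserved for testing."""
    return residue(record_id) in TEST_RESIDUES[dataset]
```

Because the residue depends only on the record number, every exam from the same patient lands in the same set, and the split can be reproduced without storing an assignment table.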

The US and UK test sets were held back from AI system development,
which only took place on the training and validation sets. Investiga- 
tors did not access test set data until models, hyperparameters, and 
operating point thresholds were finalized. None of the readers who 
interpreted the images had knowledge of any aspect of the AI system.


Inverse probability weighting 
The US test set includes images from all biopsied women, but only a 
random subset of women who never underwent biopsy. This enrich- 
ment allowed us to accrue more positives in light of the low baseline 
prevalence of breast cancer, but led to underrepresentation of normal 
cases. We accounted for this sampling process by using inverse probability
weighting to obtain unbiased estimates of human and AI system
performance in the screening population***. 

We acquired images from 7,522 of the 143,238 women who underwent 
mammography screening but had no cancer diagnosis or biopsy record. 
Accordingly, we upweighted cases from women who never underwent 
biopsy by a factor of 19.04. Further sampling occurred when selecting
one case per patient: to enrich for difficult cases, we preferentially 
chose cases from the timeframe preceding a biopsy (if one occurred). 
Although this sampling increases the diversity of benign findings, it 
again shifts the distribution from what would be observed in a typical
screening interval. To better reflect the prevalence that results when 
negative cases are randomly selected, we estimated additional factors 
by Monte Carlo simulation. Choosing one case per patient with our 
preferential sampling mechanism yielded 872 cases that were biopsied 
within 27 months, and 1,662 cases that were not (Supplementary Fig. 2). 
However, 100 trials of pure random sampling yielded on average 557.54 
and 2,056.46 cases, respectively. Accordingly, cases associated with 
negative biopsies were downweighted by 557.54/872 = 0.64. Cases that 
were not biopsied were upweighted by another 2,056.46/1,662 = 1.24,
leading to a final weight of 19.04 × 1.24 = 23.61. Cancer-positive cases
carried a weight of 1.0. The final sample weights were used in sensitiv- 
ity, specificity and ROC calculations. 
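
The arithmetic behind these weights, and the way they enter a sensitivity or specificity estimate, can be reproduced in a few lines. The sketch below uses only the counts quoted above. The group labels and the function `weighted_sens_spec` are hypothetical names, and the final weight is formed from the rounded factors, as in the text (the unrounded product is slightly smaller, about 23.56).

```python
# Counts and simulation averages quoted in the text.
base = 143_238 / 7_522            # never-biopsied women: population / sample
neg_biopsy = 557.54 / 872         # random-sampling mean / preferential count
no_biopsy = 2_056.46 / 1_662

WEIGHTS = {
    "cancer": 1.0,
    "negative_biopsy": round(neg_biopsy, 2),
    # The text multiplies the rounded factors: 19.04 x 1.24 = 23.61.
    "not_biopsied": round(round(base, 2) * round(no_biopsy, 2), 2),
}


def weighted_sens_spec(cases):
    """cases: iterable of (group, has_cancer, called_positive) tuples."""
    tp = fn = tn = fp = 0.0
    for group, has_cancer, called_positive in cases:
        w = WEIGHTS[group]
        if has_cancer and called_positive:
            tp += w
        elif has_cancer:
            fn += w
        elif called_positive:
            fp += w
        else:
            tn += w
    return tp / (tp + fn), tn / (tn + fp)
```

Each sampled normal case stands in for roughly 24 women in the screening population, which is why the weighted specificity is higher than the unweighted figure computed on the enriched sample.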


Histopathological outcomes 

In the UK dataset, benign and malignant classifications (given directly
in the metadata) followed NHSBSP definitions**. To derive the outcome
labels for the US dataset, pathology reports were reviewed by US-board- 
certified pathologists and categorized according to the findings they 
contained. An effort was made to harmonize this categorization with UK 
definitions. Malignant pathologies included ductal carcinoma in situ,
microinvasive carcinoma, invasive ductal carcinoma, invasive lobular 
carcinoma, special-type invasive carcinoma (including tubular, muci- 
nous and cribriform carcinomas), intraductal papillary carcinoma, 
non-primary breast cancers (including lymphoma and phyllodes) and 
inflammatory carcinoma. Women who received a biopsy that found any
of these malignant pathologies were considered to have a diagnosis 
of cancer. 

Benign pathologies included lobular carcinoma in situ, radial scar, 
columnar cell changes, atypical lobular hyperplasia, atypical ductal 
hyperplasia, cyst, sclerosing adenosis, fibroadenoma, papilloma, peri- 
ductal mastitis and usual ductal hyperplasia. None of these findings 
were considered to be cancerous. 


Interpreting clinical reads 

In the UK screening setting, readers categorize mammograms from 
asymptomatic women as normal or abnormal, with a third option 
for technical recall owing to inadequate image quality. An abnormal 
result at the conclusion of the double-reading process results in further 
diagnostic assessment. We treat mammograms deemed abnormal as 
a prediction of malignancy. Cases in which the consensus judgment 
recalled the patient for technical reasons were excluded from analysis, 
as the images were presumed to be incomplete or unreliable. Cases in 
which any single reader recommended technical recall were excluded 
from the corresponding reader comparison. 

In the US screening setting, radiologists attach a BI-RADS” score 
to each mammogram. A score of 0 is deemed ‘incomplete’, and will 
later be refined on the basis of follow-up imaging or repeat mammog- 
raphy to address technical issues. For computation of sensitivity and 
specificity, we dichotomized the BI-RADS assessments in line with 
previous work**. Scores of 0, 4 and 5 were treated as positive predic- 
tions if the recommendation was based on mammographic findings, 
not on technical grounds or patient symptoms alone. Cases of technical
recall were excluded from analysis, as the images were presumed to be 
incomplete or unreliable. BI-RADS scores were manually extracted from 
the free-text radiology reports. Cases for which the BI-RADS score was 
unavailable were excluded from the reader comparison. 
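
The dichotomization rule for US clinical reads can be expressed as a short function. This is a hedged sketch: the function name, the boolean flags and the treatment of subcategories such as 4A as category 4 are assumptions made for illustration and are not taken from the study's code.

```python
from typing import Optional

POSITIVE_CATEGORIES = {"0", "4", "5"}


def dichotomize(score: Optional[str], mammographic_basis: bool,
                technical_recall: bool = False) -> Optional[bool]:
    """Return True/False for a positive/negative read, or None if excluded."""
    if score is None or technical_recall:
        return None  # unavailable score or technical recall: excluded
    category = score.strip().upper()[:1]  # assumption: "4A" counts as "4"
    return category in POSITIVE_CATEGORIES and mammographic_basis
```

Returning `None` for excluded cases keeps them distinct from negative reads, so they can be dropped before sensitivity and specificity are computed.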

In both datasets, the original readers had access to contextual infor- 
mation that is normally available in clinical practice. This includes 
the patient’s family history of cancer, prior screening and diagnostic 
imaging, and radiology or pathology notes from past examinations.
By contrast, only the age of the patient was made available to the AI
system. 


Overview of the AI system

The AI system consisted of an ensemble of three deep learning models,
each operating on a different level of analysis (individual lesions,
individual breasts and the full case). Each model produces a cancer
risk score between 0 and 1 for the entire mammography case. The final
prediction of the system was the mean of the predictions from the 
three independent models. A detailed description of the Al system is 
available in the Supplementary Methods and Supplementary Fig. 3. 


Selection of operating points 

The AI system natively produces a continuous score that represents the
likelihood of cancer being present. To support comparisons with the 
predictions of human readers, we thresholded this score to produce 
analogous binary screening decisions. For each clinical benchmark, 
we used the validation set to choose a distinct operating point; this 
amounts to a score threshold that separates positive and negative 
decisions. To better simulate prospective deployment, the test sets 
were never used in selecting operating points. 

The UK dataset contains three clinical benchmarks—the first reader, 
second reader and consensus. This last decision is the outcome of the 
double-reading process and represents the standard of care in the 
UK. For the first reader, we chose an operating point aimed at dem- 
onstrating statistical superiority in specificity and non-inferiority for 
sensitivity. For the second reader and consensus reader, we chose an 
operating point aimed at demonstrating statistical non-inferiority for 
both sensitivity and specificity. 

The US dataset contains a single operating point for comparison, 
which corresponds to the radiologist using the BI-RADS rubric for evaluation.
In this case, we used the validation set to choose an operating
point aimed at achieving superiority for both sensitivity and specificity. 
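
To make the thresholding step concrete, the sketch below averages three model scores (as described in 'Overview of the AI system') and then searches validation-set thresholds. The selection criterion shown, the highest specificity among thresholds that reach a required sensitivity, is an assumption chosen for illustration. The text states the aims of each operating point but not the exact search procedure, and all function names are hypothetical.

```python
from statistics import mean


def ensemble_score(lesion: float, breast: float, case: float) -> float:
    """Final case-level score: the mean of the three model predictions."""
    return mean([lesion, breast, case])


def sens_spec(scores, labels, threshold):
    """Sensitivity and specificity when score >= threshold is called positive."""
    calls = [s >= threshold for s in scores]
    pos = [c for c, y in zip(calls, labels) if y]
    neg = [c for c, y in zip(calls, labels) if not y]
    return sum(pos) / len(pos), 1 - sum(neg) / len(neg)


def choose_operating_point(scores, labels, min_sensitivity):
    """Return (threshold, sensitivity, specificity) with the best specificity
    among validation thresholds reaching min_sensitivity, or None."""
    best = None
    for t in sorted(set(scores)):
        sens, spec = sens_spec(scores, labels, t)
        if sens >= min_sensitivity and (best is None or spec > best[2]):
            best = (t, sens, spec)
    return best
```

In such a scheme `min_sensitivity` would be derived from the benchmark reader's validation-set sensitivity, and the chosen threshold would then be frozen before any test-set case is scored.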


Reader study 

For the reader study, six US-board-certified radiologists interpreted 
a sample of 500 cases from 500 women in the test set. All radiologists
were compliant with MQSA requirements for interpreting mammog- 
raphy and had an average of 10 years of clinical experience (Extended 
Data Table 7b). Two of them were fellowship-trained in breast imaging. 
The sample of cases was stratified to contain 50% normal cases, 25% 
biopsy-confirmed negative cases and 25% biopsy-confirmed positive 
cases. A detailed description of the case composition of the reader study 
can be found in Extended Data Table 3. Readers were not informed of 
the enrichment levels in the dataset. 

Readers recorded their assessments on a 21CFR11-compliant elec- 
tronic case report form within the Ambra Health (New York, NY) viewer 
v3.18.7.0R. They interpreted the images using 5MP MQSA-compliant
displays. Each reader interpreted the cases in a unique randomized 
order. 

For each study, readers were asked to first report a BI-RADS® 5th 
edition score using the values 0, 1 and 2, as if they were interpreting the
screening mammogram in routine practice. They were then asked to 
render a forced diagnostic BI-RADS score using the values 1, 2, 3, 4A, 4B, 
4C or 5. Readers also gave a finer-grained score between 0 and 100 that 
was indicative of their suspicion that the case contains a malignancy. 

In addition to the four standard mammographic screening images, 
clinical context was provided to better simulate the screening set- 
ting. Readers were presented with the preamble of the de-identified 
radiology report that was produced by the radiologist who originally 
interpreted the study. This contained information such as the age of the
patient and their family history of cancer. The information was manu- 
ally reviewed to ensure that no impression or findings were included. 

Where possible (in 43% of cases), previous imaging was made available
to the readers. Readers could review up to four sets of previous
screening exams that were acquired between 1 and 4 years earlier,
accompanied by de-identified radiologist reports. If prior imaging 
was available, the study was read twice by each reader: first without the
prior information, and then immediately after, with the prior informa- 
tion present. The system ensured that readers could not update their 
initial assessment after the prior information was presented. For cases 
for which previous exams were available, the final reader assessment 
(given after having reviewed the prior exams) was used for the analysis. 

Cases in which at least half of the readers indicated concerns with 
image quality were excluded from the analysis. Cases in which breast 
implants were noted were also excluded. The final analysis was per- 
formed on the remaining 465 cases. 


Localization analysis 

For this purpose, we considered all screening exams from the reader 
study for which cancer developed within 12 months. See Extended Data 
Table 3 for a detailed description of how the dataset was constructed. 
To collect ground-truth localizations, two board-certified radiologists 
inspected each case, using follow-up data to identify the location of 
malignant lesions. Instances of disagreement were resolved by one 
radiologist with fellowship training in breast imaging. To identify the 
precise location of the cancerous tissue, radiologists consulted sub- 
sequent diagnostic mammograms, radiology reports, biopsy notes, 
pathology reports and post-biopsy mammograms. Rectangular bound- 
ing boxes were drawn around the locations of subsequent positive 
biopsies in all views in which the finding was visible. In cases in which no 
mammographic finding was visible, the location where the lesion later 
appeared was highlighted. Of the 56 cancers considered for analysis, 
location information could be obtained with confidence in 53 cases; 
three cases were excluded owing to ambiguity in the index examina- 
tion and the absence of follow-up images. On average, there were 2.018 
ground-truth regions per cancer-positive case. 

In the reader study, readers supplied rectangular ROI annotations 
surrounding suspicious findings in all cases to which they assigned a 
BI-RADS score of 3 or higher. A limit of six ROIs per case was enforced. 
On average, the readers supplied 2.04 annotations per suspicious case. 
In addition to an overall cancer likelihood score, the AI system produces
a ranked list of rectangular bounding boxes for each case. To conduct
a fair comparison, we allowed only the top two bounding boxes from
the AI system to match the number of ROIs produced by the readers.

To compare the localization performance of the AI system with that of
the readers, we used a method inspired by location receiver operating 
characteristic (LROC) analysis®”. LROC analysis differs from traditional 
ROC analysis in that the ordinate is a sensitivity measure that factors in
localization accuracy. Although LROC analysis traditionally involves a 
single finding per case*”“’, we permitted multiple unranked findings to 
match the format of our data. We use the term multi-localization ROC 
analysis (mLROC) to describe our approach. For each threshold, a cancer
case was considered a true positive if its case-wide score exceeded
this threshold and at least one culprit area was correctly localized in 
any of the four mammogram views. Correct localization required an 
intersection-over-union (IoU) of 0.1 with the ground-truth ROI. False 
positives were defined as usual. 
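
The true-positive criterion used in this mLROC analysis can be sketched as follows. The box format `(x0, y0, x1, y1)`, the per-view dictionaries and the function names are assumptions made for illustration. Treating an IoU of exactly 0.1 as sufficient is also an assumption, as the text does not specify the boundary case.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x0, y0, x1, y1)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(case_score, threshold, predicted, truth, min_iou=0.1):
    """A cancer case counts as a true positive if its case-wide score exceeds
    the threshold and at least one predicted box matches a ground-truth box
    in the same view. `predicted` and `truth` map view name -> list of boxes."""
    if case_score <= threshold:
        return False
    for view, boxes in predicted.items():
        for p in boxes:
            if any(iou(p, g) >= min_iou for g in truth.get(view, [])):
                return True
    return False
```

Sweeping `threshold` over the case-wide scores and pairing the resulting localization-aware sensitivity with the usual false-positive rate traces out the mLROC curve.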

CAD systems are often evaluated on the basis of whether the centre 
of their marking falls within the boundary of a ground-truth annota- 
tion**. This is potentially problematic as it does not properly penalize 
predicted bounding boxes that are so large as to be non-specific, but 
whose centre nevertheless happens to fall within the target region. Simi- 
larly, large ground-truth annotations associated with diffuse findings 
might be overly generous to the CAD system. We prefer the IoU metric
because it balances these considerations. We chose a threshold of 0.1 to
account for the fact that indistinct margins on mammography findings 
lead to ROI annotations of vastly different sizes depending on subjec- 
tive factors of the annotator (see Supplementary Fig. 4). Similar work 
in three-dimensional chest computed tomography" used any pixel
overlap to qualify for correct localization. Likewise, an FDA-approved
software device for the detection of wrist fractures reports statistics 
in which true positives require at least one pixel of overlap”. An IoU
value of 0.1 is strict by these standards.


Statistical analysis 

To evaluate the stand-alone performance of the AI system, the AUC-ROC
was estimated using the normalized Wilcoxon (Mann-Whitney)
U statistic®°. This is the standard non-parametric method used by most
modern software libraries. For the UK dataset, non-parametric confi- 
dence intervals on the AUC were computed with DeLong’s method”. 
For the US dataset, in which each sample carried a scalar weight, the 
bootstrap was used with 1,000 replications. 
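
For concreteness, the normalized Mann-Whitney estimate of the AUC and a bootstrap interval for weighted data can be sketched with the standard library alone. This is an illustration under stated assumptions: weighting each positive-negative pair by the product of the two case weights and using a percentile interval are choices made here, not details given in the text, and DeLong's method (used for the UK data) is not shown.

```python
import random


def auc(scores, labels, weights=None):
    """Normalized Mann-Whitney U: P(pos > neg) + 0.5 * P(tie).
    Optional case weights enter as products over positive-negative pairs."""
    if weights is None:
        weights = [1.0] * len(scores)
    pos = [(s, w) for s, y, w in zip(scores, labels, weights) if y]
    neg = [(s, w) for s, y, w in zip(scores, labels, weights) if not y]
    num = den = 0.0
    for sp, wp in pos:
        for sn, wn in neg:
            den += wp * wn
            if sp > sn:
                num += wp * wn
            elif sp == sn:
                num += 0.5 * wp * wn
    return num / den


def bootstrap_ci(scores, labels, weights, n_rep=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the weighted AUC (cases resampled)."""
    rng = random.Random(seed)
    n = len(scores)
    stats = []
    while len(stats) < n_rep:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if not any(ys) or all(ys):
            continue  # a replicate needs both classes to define an AUC
        stats.append(auc([scores[i] for i in idx], ys,
                         [weights[i] for i in idx]))
    stats.sort()
    lo = stats[int(alpha / 2 * n_rep)]
    hi = stats[min(n_rep - 1, int((1 - alpha / 2) * n_rep))]
    return lo, hi
```

The pairwise loop is quadratic in the number of cases, which is acceptable for a sketch. Rank-based implementations give the same unweighted value more efficiently.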

For both datasets, we compared the sensitivity and specificity of the 
readers with that of a thresholded score from the AI system. For the 
UK dataset, we knew the pseudo-identity of each reader, so statistics 
were adjusted for the clustered nature of the data using Obuchowski’s 
method for paired binomial proportions****. Confidence intervals on 
the difference are Wald intervals» and a Wald test was used for non- 
inferiority®’. Both used the Obuchowski variance estimate. 

For the US dataset, in which each sample carried a scalar inverse 
probability weight*, we used resampling methods” to compare the 
sensitivity and specificity of the AI system with those of the pool of
radiologists. Confidence intervals on the difference were generated 
with the bootstrap method with 1,000 replications. A P value on the 
difference was generated through the use of a permutation test™. In 
each of 10,000 trials, the reader and AI system scores were randomly
interchanged for each case, yielding a reader–AI system difference
sampled from the null distribution. A two-sided P value was computed
by comparing the observed statistic to the empirical quantiles of the 
randomization distribution. 
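
The permutation scheme described above can be sketched for a weighted rate such as sensitivity (computed over cancer-positive cases) or specificity (over negative cases, with calls inverted). The function names are hypothetical, and the add-one correction in the P value is a common convention adopted here as an assumption, not a detail from the text.

```python
import random


def weighted_rate(calls, weights):
    """Weighted fraction of cases with a True call."""
    return sum(w for c, w in zip(calls, weights) if c) / sum(weights)


def permutation_p_value(reader_calls, ai_calls, weights,
                        n_trials=10_000, seed=0):
    """Two-sided P value for the AI-minus-reader difference in a weighted rate.
    In each trial the two decisions are swapped per case with probability 0.5."""
    rng = random.Random(seed)
    observed = (weighted_rate(ai_calls, weights)
                - weighted_rate(reader_calls, weights))
    extreme = 0
    for _ in range(n_trials):
        r, a = [], []
        for rc, ac in zip(reader_calls, ai_calls):
            if rng.random() < 0.5:
                rc, ac = ac, rc
            r.append(rc)
            a.append(ac)
        diff = weighted_rate(a, weights) - weighted_rate(r, weights)
        if abs(diff) >= abs(observed) - 1e-12:
            extreme += 1
    return observed, (extreme + 1) / (n_trials + 1)
```

Swapping within a case preserves the pairing between reader and AI decisions, so the null distribution reflects only the labelling of which decision came from which source.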

In the reader study, each reader graded each case using a forced BI-RADS
protocol (a score of 0 was not permitted), and the resulting values
were treated as a 6-point index of suspicion for malignancy. Scores of
1 and 2 were collapsed into the lowest category of suspicion; scores
3, 4a, 4b, 4c and 5 were treated independently as increasing levels of 
suspicion. Because none of the BI-RADS operating points reached the 
high-sensitivity regime (see Fig. 3), to avoid bias from non-parametric 
analysis” we fitted parametric ROC curves to the data using the proper 
binormal model“. This issue was not alleviated by using the readers’ 
ratings for their suspicion of malignancy, which showed very strong 
correspondence with the BI-RADS scores (Supplementary Fig. 5). As 
BI-RADS is used in actual screening practice, we chose to focus on these 
scores for their superior clinical relevance. In a similar fashion, we 
fitted a parametric ROC curve to discretized AI system scores on the
same data. 
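
The mapping from forced BI-RADS scores to the 6-point index, and the empirical operating points it yields, can be illustrated as follows. The integer coding 0 to 5 and the function names are assumptions. The proper binormal fit itself is not reproduced here, as it requires a dedicated maximum-likelihood routine.

```python
# Scores 1 and 2 share the lowest level; the rest are ordered levels.
SUSPICION_INDEX = {"1": 0, "2": 0, "3": 1, "4A": 2, "4B": 3, "4C": 4, "5": 5}


def suspicion_index(forced_birads: str) -> int:
    """Map a forced BI-RADS score to the 6-point index (0 is not permitted)."""
    key = forced_birads.strip().upper()
    if key not in SUSPICION_INDEX:
        raise ValueError(f"not a forced BI-RADS score: {forced_birads!r}")
    return SUSPICION_INDEX[key]


def operating_points(indices, labels):
    """Empirical (false-positive rate, sensitivity) at each cut-off >= 1."""
    n_pos = sum(1 for y in labels if y)
    n_neg = len(labels) - n_pos
    points = []
    for cut in range(1, 6):
        tp = sum(1 for i, y in zip(indices, labels) if y and i >= cut)
        fp = sum(1 for i, y in zip(indices, labels) if not y and i >= cut)
        points.append((fp / n_neg, tp / n_pos))
    return points
```

With only five cut-offs, the most lenient empirical point may sit well below full sensitivity, which is the situation that motivates fitting a parametric curve instead of interpolating linearly to the corner.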

The performance of the AI system was compared to that of the panel
of radiologists using methods for the analysis of multi-reader multi- 
case (MRMC) studies that are standard in the radiology community. 
More specifically, we compared the AUC-ROC and pAUC-mLROC 
for the AI system to those of the average radiologist using the ORH
procedure’. Originally formulated for the comparison of multiple
imaging modalities, this analysis has been adapted to the setting in 
which the population of radiologists operate on a single modality and
interest lies in comparing their performance to that of a stand-alone 
algorithm”. The jackknife method was used to estimate the covariance 
terms in the model. Computation of P values and confidence intervals
was conducted in Python using the numpy and scipy packages, and 
benchmarked against a reference implementation in the RJafroc library 
for the R computing language
(https://cran.r-project.org/web/packages/RJafroc/index.html).

Our primary comparisons numbered seven in total: sensitivity and 
specificity for the UK first reader; sensitivity and specificity for the US 
clinical radiologist; sensitivity and specificity for the US clinical radiolo- 
gist against a model trained using only UK data; and the AUC-ROC in 
the reader study. For comparisons with the clinical reads, the choice
of superiority or non-inferiority was based on what seemed attainable 
from simulations conducted on the validation set. For non-inferior- 
ity comparisons, a 5% absolute margin was pre-specified before the 
test set was inspected. We used a statistical significance threshold of 
0.05. All seven P values survived correction for multiple comparisons 
using the Holm-Bonferroni method™. 
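
The Holm-Bonferroni step-down procedure referred to above is short enough to state in full. The sketch below is a generic implementation. The P values in the usage note are hypothetical and are not the study's results.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return a list of booleans, True where the null hypothesis is rejected.
    P values are tested in ascending order against alpha / (m - rank);
    testing stops at the first failure."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        if p_values[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break
    return reject
```

For seven comparisons the smallest P value must fall below 0.05/7 (about 0.0071), the next below 0.05/6, and so on. For example, `holm_bonferroni([0.01, 0.04, 0.03])` rejects only the first hypothesis, because 0.03 exceeds 0.05/2.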


Reporting summary 
Further information on research design is available in the Nature 
Research Reporting Summary linked to this paper. 


Data availability 


The dataset from Northwestern Medicine was used under license for 
the current study, and is not publicly available. Applications for access 
to the OPTIMAM database can be made at
https://medphys.royalsurrey.nhs.uk/omidb/getting-access/.


Code availability 


The code used for training the models has a large number of dependen- 
cies on internal tooling, infrastructure and hardware, and its release is
therefore not feasible. However, all experiments and implementation 
details are described in sufficient detail in the Supplementary Methods
section to support replication with non-proprietary libraries. Several 
major components of our work are available in open source reposi- 
tories: Tensorflow (https://www.tensorflow.org); Tensorflow Object 
Detection API
(https://github.com/tensorflow/models/tree/master/research/object_detection).
Extended Data Fig. 1 | Unweighted evaluation of breast cancer prediction on
the US test set. In contrast to in Fig. 2b, the sensitivity and specificity were
computed without the use of inverse probability weights to account for the
spectrum enrichment of the study population. Because hard negatives are
overrepresented, the specificity of both the AI system and the human readers is
reduced. The unweighted human sensitivity and specificity are 48.10% (n=553)
and 69.65% (n=2,185), respectively.
Extended Data Fig. 2 | Performance of the AI system in breast cancer
prediction compared to six independent readers, with a 12-month follow-up
interval for cancer-positive status. Whereas the mean reader AUC was 0.750
(s.d. 0.049), the AI system achieved an AUC of 0.871 (95% CI 0.785, 0.919). The AI
system exceeded human performance by a significant margin (ΔAUC = +0.121,
95% CI 0.070, 0.173; P = 0.0018 by two-sided ORH method). In this analysis,
there were 56 positives of 408 total cases; see Extended Data Table 3. Note that
this sample of cases was enriched for patients who had received a negative
biopsy result (n=119), making it a more challenging population for screening.
As these external readers were not gatekeepers for follow-up and eventual
cancer diagnosis, there was no bias in favour of reader performance at this
shorter time horizon. See Fig. 3a for a comparison with a time interval that was
chosen to encompass a subsequent screening exam.
Extended Data Fig. 3 | Localization (mLROC) analysis. Similar to Extended
Data Fig. 2, but true positives require localization of a malignancy in any of the
four mammogram views (see Methods section ‘Localization analysis’). Here,
the cancer interval was 12 months (n=53 positives of 405 cases; see Extended
Data Table 3). The dotted line indicates a false-positive rate of 10%, which was
used as the right-hand boundary for the pAUC calculation. The mean reader
pAUC was 0.029 (s.d. 0.005), whereas that of the AI system was 0.048 (95% CI
0.035, 0.061). The AI system exceeded human performance by a significant
margin (ΔpAUC = +0.0192, 95% CI 0.0086, 0.0298; P = 0.0004 by two-sided
ORH method).
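The partial AUC referred to in this caption can be illustrated with a short sketch. It integrates an ordinary ROC curve up to a false-positive rate of 10% using the trapezoidal rule. It does not implement the localization requirement of the mLROC analysis, and the function names and the linear interpolation at the boundary are assumptions made for illustration, not the authors' code.

```python
def roc_points(scores, labels):
    """Return ROC points (fpr, tpr), one per distinct score threshold."""
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    n_pos = sum(1 for y in labels if y)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Consume every case tied at this score before emitting a point.
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def partial_auc(points, max_fpr=0.10):
    """Trapezoidal area under the ROC curve for FPR in [0, max_fpr]."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 >= max_fpr:
            break
        if x1 > max_fpr:
            # Cut the segment at the right-hand boundary by interpolation.
            y1 = y0 + (y1 - y0) * (max_fpr - x0) / (x1 - x0)
            x1 = max_fpr
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area
```

With a 10% boundary the largest attainable value is 0.10, which puts the reported pAUC values of 0.029 and 0.048 in context.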
Extended Data Fig. 4 | Evidence for the gatekeeper effect in retrospective
datasets. a, b, Graphs show the change in observed reader sensitivity in the
UK (a) and the USA (b) as the cancer follow-up interval is extended. At short
intervals, measured reader sensitivity is extremely high, because
biopsies are only triggered based on radiological suspicion. As the time
interval is extended, the task becomes more difficult and measured sensitivity
declines. Part of this decline stems from the development of new cancers that
were impossible to detect at the initial screening. However, steeper drops
occur when the follow-up window encompasses the screening interval (36
months in the UK; 12 and 24 months in the USA). This is suggestive of what
happens to reader metrics when gatekeeper bias is mitigated by another
screening examination. In both graphs, the number of positives grows as the
follow-up interval is extended. In the UK dataset (a), it increases from n=259
within 3 months to n=402 within 39 months. In the US dataset (b), it increases
from n=221 within 3 months to n=553 within 39 months.
Extended Data Fig. 5 | Quantitative evaluation of reader and AI system
performance with a 12-month follow-up interval for ground-truth
cancer-positive status. Because a 12-month follow-up interval is unlikely to
encompass a subsequent screening exam in either country, reader-model
comparisons on retrospective clinical data may be skewed by the gatekeeper
effect (Extended Data Fig. 4). See Fig. 2 for comparison with longer time
intervals. a, Performance of the AI system on UK data. This plot was derived
from a total of 25,717 eligible examples, including 274 positives. The AI system
achieved an AUC of 0.966 (95% CI 0.954, 0.977). b, Performance of the AI system
on US data. This plot was derived from a total of 2,770 eligible examples,
including 359 positives. The AI system achieved an AUC of 0.883 (95% CI 0.859,
0.903). c, Reader performance. When computing reader metrics, we excluded
cases for which the reader recommended repeat mammography to address
technical issues. In the US data, the performance of radiologists could only be
assessed on the subset of cases for which a BI-RADS grade was available.
Extended Data Table 1 | Characteristics of the UK and US test sets

a
                                UK test set         CI at 95%      NHS BSP
Years                           2012 to 2015        -              2011 to 2016
Sources                         2 sites in the UK   -              All UK screening sites
No. women                       25,856              -              10,257,551
No. normals                     25,588 (99.0%)      (98.8, 99.1)   10,171,074 (99.1%)
No. cancers                     268 (1.0%)          (0.9, 1.2)     86,477 (0.8%)
Recall rate                     1,235 (4.8%)        (4.5, 5.1)     427,457 (4.2%)
Age              45-49          1,707 (6.6%)        (6.2, 7.1)     832,883 (8.1%)
                 50-52          4,399 (17.1%)       (16.4, 17.7)   1,487,366 (14.5%)
                 53-54          2,742 (10.6%)       (10.1, 11.1)   944,823 (9.2%)
                 55-59          6,034 (23.3%)       (22.6, 24.0)   2,139,701 (20.9%)
                 60-64          5,457 (21.1%)       (20.4, 21.8)   2,044,746 (19.9%)
                 65-70          4,575 (17.7%)       (17.0, 18.3)   2,217,947 (21.6%)
                 >= 70          942 (3.6%)          (3.3, 4.0)     590,085 (5.8%)
Cancer type      Invasive       204 (76.1%)         (69.5, 81.8)   68,006 (78.6%)
                 Non-invasive   58 (21.6%)          (16.2, 28.1)   17,733 (20.5%)
                 Micro-invasive -                   -              654 (0.8%)
                 Unknown        6 (2.2%)            (0.9, 5.6)     84 (0.1%)
Cancer size      < 10mm         41 (20.1%)          (13.7, 28.3)   17,242 (25.4%)
(invasive only)  10-15mm        44 (21.6%)          (15.3, 30.4)   17,745 (26.1%)
                 15-20mm        39 (19.1%)          (12.9, 27.2)   12,864 (18.9%)
                 20-50mm        61 (29.9%)          (22.1, 38.7)   16,316 (24.0%)
                 >= 50mm        13 (6.4%)           (3.1, 12.4)    1,527 (2.3%)
                 Unknown        6 (2.9%)            (1.0, 7.9)     2,312 (3.4%)

b
                                US test set         CI at 95%      US BCSC
Years                           2001 to 2018        -              2007 to 2013
Sources                         1 US medical center -              6 BCSC registries
No. women                       3,097               -              1,682,504
No. normals                     2,738 (88.4%)       (87.2, 89.8)   1,672,692 (99.4%)
No. cancers                     359 (11.6%)         (10.2, 12.8)   9,812 (0.6%)
Recall rate                     929 (30.0%)         (18.4, 21.5)   195,170 (11.6%)
Age              < 40           181 (5.9%)          (4.8, 7.1)     41,479 (2.5%)
                 40-49          1,259 (40.8%)       (38.6, 43.0)   448,587 (26.7%)
                 50-59          800 (26.1%)         (24.1, 28.1)   505,816 (30.1%)
                 60-69          598 (19.0%)         (17.3, 20.9)   396,943 (23.6%)
                 >= 70          259 (8.2%)          (7.0, 9.5)     289,679 (17.3%)
Cancer type      Invasive       240 (66.9%)         (60.5, 72.1)   5,885 (69.0%)
                 DCIS           100 (27.9%)         (22.8, 33.9)   2,644 (31.0%)
                 Other          19 (5.3%)           (3.2, 8.9)     -

For each feature, we constructed a joint 95% confidence interval on the proportions in each category. a, The UK test set was drawn from two sites in the UK over a four-year period. For reference, 
we present the corresponding statistics from the broader UK NHSBSP®. For comparison with national numbers, only cancers that were detected by screening are reported here. b, The US test 
set was drawn from one academic medical centre over an eighteen-year period. For reference, we present the corresponding statistics from the broader US screening population, as reported 
by the Breast Cancer Surveillance Consortium (BCSC)?. Cancers reported here occurred within 12 months of screening. 

DCIS, ductal carcinoma in situ. 


Extended Data Table 2 | Detailed comparison between human clinical decisions and Al predictions 


a 


test dataset | benchmark | clinical decision (%) | AI decision (%) | Δ (%) | 95% CI (%) | p-value | comparison
sensitivity | 62.69 65.42 2.70 
first reader 
specificity | 92.93 94.12 
second sensitivity | 69.40 69.40 
reader | specificity | 92.97 | 92.13 | -0.84 


40 
(0.29, 2.08) | 0.0096 25,115 
sensitivity | 67.39 | 68.12 | 0.72 | (-3.49, 4.94) | 0.0039 
consensus 
specificity | 96.24 | 96.24 | -3.35 | (-4.06, -2.63) 25,442 


UK 

sensitivity | 48.10 57.50 (4.45, 13.85) | 0.0004 | superiority 
USA reader 

specificity | 80.83 86.53 5.70 (2.62, 8.64) | 0.0002 | superiority | 2,185 


48.10 | 56.24 | 8.14 | (3.54, 12.5) | 0.0006 | superiority 
USA reader 
specificity | 80.83 | 84.29 (0.6, 5.98) | 0.0212 2,185 


a, Comparison of sensitivity and specificity between human benchmarks (derived retrospectively from the clinical record) and the predictions of the AI system. Score thresholds were chosen,
on the basis of separate validation data, to match or exceed the performance of each human benchmark (see Methods section ‘Selection of operating points’). These points are depicted
graphically in Fig. 2. Note that the number of cases (N) differs from Fig. 2 because the opinion of the radiologist was not available for all images. We also note that sensitivity and specificity metrics
are not easily comparable to most previous publications in breast imaging (for example, the DMIST trial), given the differences in follow-up interval. Negative cases in the US dataset were
upweighted to account for the sampling protocol (see Methods section ‘Inverse probability weighting’). b, Same columns as a, but using a version of the AI system that was trained exclusively
on the UK dataset. It was tested on the US dataset to show generalizability of the AI across populations and healthcare systems. Superiority comparisons on the UK data were conducted using
Obuchowski’s extension of the two-sided McNemar test for clustered data. Non-inferiority comparisons were Wald tests using the Obuchowski correction. Comparisons on the US data were
performed with a two-sided permutation test. All P values survived correction for multiple comparisons (see Methods section ‘Statistical analysis’). Quantities in bold represent estimated
differences that are statistically significant for superiority; all others are statistically non-inferior at a pre-specified 5% margin.


(-3.0,8.5) | 0.0043 noninferiority | 402 
Extended Data Table 3 | Detailed description of the case composition for the reader study


No. No. biopsied 
Description cancer negative 
cases cases 


inclusion based on 
27-month outcome 


restrict to cancers in Figure 3c 
12 months Extended Data Figure 2 


obtain ground truth 
localizations 


Row 1: 500 cases were selected for the reader study. The case mixture was enriched for positives as well as challenging negatives. Row 2: cases containing breast implants and those for which 
at least half of the readers indicated image-quality concerns were excluded from analysis. The remaining 465 cases are represented in Fig. 3a, b. Row 3: for further analysis, we restricted the 
cancers to those that developed within 12 months. Cases in which cancer developed later (but within 27 months) were excluded because they did not meet the follow-up criteria to be
considered negative. The remaining 408 cases are represented in Fig. 3c and Extended Data Fig. 2. Row 4: to perform localization analysis, the areas of malignancy were determined using follow-up
biopsy data. In three instances, ground truth could not reliably be determined. The remaining 405 cases are represented in Extended Data Fig. 3. 


Extended Data Table 4 | Potential use of the AI system in two clinical applications

a
                              Sensitivity (%)   Specificity (%)   Simulated reduction of
                              (n = 414)         (n = 25,422)      second reader workload (%)
AI as second reader (UK)      66.66             96.26             87.98
Existing workflow (UK)        67.39             96.24             -
95% CI on the difference      (-2.68, 1.23)     (-0.13, 0.17)     -

b
Triage status   Dataset   Sensitivity (%)         Specificity (%)         Reliability of triage
                          (95% CI)                (95% CI)                decision (%) (95% CI)
Negative        UK        99.63 (98.88, 100.0)    41.15 (40.57, 41.72)    99.99 (NPV) (99.97, 100.0)
                          n = 274                 n = 25,443              n = 10,471
Negative        USA       98.05 (96.12, 99.16)    34.79 (31.97, 37.60)    99.90 (NPV) (99.83, 99.96)
                          n = 359                 n = 2,411               n = 720
Positive        UK        41.24 (35.63, 47.08)    99.92 (99.89, 99.95)    85.69 (PPV) (79.66, 90.98)
                          n = 274                 n = 25,443              n = 132
Positive        USA       29.80 (25.21, 34.45)    99.90 (99.78, 99.97)    82.41 (PPV) (65.38, 94.71)
                          n = 359                 n = 2,411               n = 121

a, Simulation, using the UK test set, in which the AI system is used in place of the second reader when it concurs with the first reader. In cases of disagreement (12.02%) the consensus opinion
was invoked. The high performance of this combination of human and machine suggests that approximately 88% of the effort of the second reader can be eliminated while maintaining the
standard of care that is produced by double reading. The decision of the AI system was generated using the first reader operating point (i) shown in Fig. 2a. Confidence intervals are Wald
intervals computed with the Obuchowski correction for clustered data. b, Evaluation of the AI system for low-latency triage. Operating points were set to perform with high NPV and PPV for
detecting cancer in 12 months.
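The simulation described in a can be sketched as follows. This is a minimal illustration, assuming each case carries three recorded binary decisions (first reader, AI system, historical consensus); the function name and data layout are hypothetical and are not taken from the authors' code.

```python
def simulate_ai_second_reader(first_reader, ai, consensus):
    """Combine first-reader and AI decisions, escalating on disagreement.

    Each argument is a list of booleans (True = recall), one per case.
    Returns the final decisions and the fraction of cases that still
    needed a human second read.
    """
    final = []
    escalated = 0
    for first, model, arbitrated in zip(first_reader, ai, consensus):
        if first == model:
            # AI concurs with the first reader: no second read required.
            final.append(first)
        else:
            # Disagreement: fall back to the recorded consensus opinion.
            final.append(arbitrated)
            escalated += 1
    return final, escalated / len(final)
```

Under this reading, the workload reduction is one minus the escalated fraction: a disagreement rate of 12.02% corresponds to the 87.98% reduction reported in the table.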
Extended Data Table 5 | Discrepancies between the AI system and human readers


Dataset Cancer type 


Invasive 


Al caught, 
reader missed 


Reader caught, 
Al missed 


UK In situ 7 12 
Unknown 7 2 
ILC or IDC 


Invasive cancer 
grade (UK only) 


Grade 1 


Al caught, 
reader missed 


Reader caught, 
Al missed 


Grade 2 


15 


Grade 3 


Invasive primary Al caught, Reader caught, 
tumour size (UK only) reader missed Al missed 
< 10mm 4 6 
10 -— 15mm 6 7 
15 — 20mm 5 2 
20 —- 50mm 14 4 
>= 50mm 2 1 
For the UK comparison, we used the first reader operating point (i) shown in Fig. 2a. For the US
comparison, we used the operating point shown in Fig. 2b. ILC, invasive lobular carcinoma;
IDC, invasive ductal carcinoma; DCIS, ductal carcinoma in situ.


Extended Data Table 6 | Performance breakdown 


a 
Cancer type (UK first reader) AI system Reader Delta (95% CI) No. examples
Grade 1 81.94 73.61 8.33 (-4.71, 21.38) 72 
vnveeneaeas Grade 2 63.87 62.58 1.29 (-6.60, 9.15) 155 
Grade 4 69.36 64.52 4.84 (-3.66, 13.34) 62 
Grade unknown 25 25 - 8 
High grade 58.97 53.85 5.13 (-14.19, 24.45) 39 
Intermediate grade 25 75 -50.00 (-100.00, 14.82) 8 
Sensitivity pene Low grade 56 64 -8.00 (-24.194, 8.19) 25. 
Grade unknown 69.23 76.92 -7.69 (-35.08, 19.70) 13 
< 10mm 61.81 65.46 -3.64 (-14.86, 7.59) 55 
10 -— 15mm 72.73 74.55 -1.82 (-14.66, 11.02) 55 
PUAN O Se 15 — 20mm 71.42 66.07 5.36 (-3.80, 14.51) 56 
(invasive only) 
20 — 50mm 67.3 57.43 9.90 (1.90, 17.90) 101 
>= 50mm 88.24 82.35 5.88 (-13.89, 25.65) 17 
b 
Cancer type (US clinical radiologist) AI system Reader Delta (95% CI) No. examples
ILC or IDC Star 45.33 12.63 (6.88, 18.39) 364 
Sensitivity DCIS 57.05 54.6 2.45 (-6.70, 11.60) 163 
Other 53.85 46.15 7.69 (-18.25, 33.64) 26 
Cc 
Breast density (US clinical radiologist) AI system Reader Delta (95% CI) No. examples
Entirely fatty 53.84 48.71 5.12 (-12.21, 22.46) 39 
Scattered fibroglandular densities 60.41 49.58 10.8 (3.39, 18.28) 240 
Sensitivity Heterogeneously dense 56.11 48.1 8.01 (0.93, 15.11) 237 
Extremely dense 16.67 25 -8.33 (-44.55, 27.88) 12 
Unknown 66.67 66.67 0.00 (-92.39, 92.39) & 
Entirely fatty 90.6 82.88 7.72 (-1.24, 17.40) 6 
Scattered fibroglandular densities 86.78 80.75 6.03 (1.57, 10.42) 149 
Adjusted specificity Heterogeneously dense 85.65 80.55 5.09 (0.76, 9.74) 831 
Extremely dense 92.18 77.1 15.07 (-1.90, 33.74) 1,061 
Unknown 95.34 93.01 2.33 (-25.36, 57.62) 73 
Entirely fatty 85.23 77.85 7.38 (-0.08, 14.85) 6 
Scattered fibroglandular densities 80.75 71 9.74 (5.92, 13.57) 149 
Specificity Heterogeneously dense 80.21 67.39 12.82 (9.38, 16.26) 831 
Extremely dense 86.3 75.34 10.96 (-2.50, 24.42) 1,061 
Unknown 66.67 50 16.67 (-38.32, 71.65) 73 
The analysis excludes technical recalls and US cases for which BI-RADS scores were unavailable. a, Sensitivity across cancer subtypes in the UK data. We used the first reader operating point (i)
shown in Fig. 2a. Also shown is the performance of the first reader on the same subset. b, Sensitivity across cancer subtypes in the US data. We used the operating point shown in Fig. 2b.
Reader performance was derived from the clinical BI-RADS scores on the same subset. ILC, invasive lobular carcinoma; IDC, invasive ductal carcinoma; DCIS, ductal carcinoma in situ.
c, Performance across breast density categories. BI-RADS breast density was extracted from the radiology report rendered at the time of screening, which was only available in the US dataset.
We used the operating point shown in Fig. 2b. Adjusted specificities were computed using inverse probability weighting (Methods).
Extended Data Table 7 | Reader experience


a UK test set


Reads per year No. readers 
3,000-4,000 3 
4,000-5,000 6 
5,000-6,000 3 
6,000-7,000 1 
7,000-8,000 2 

8,000+ 3 
Unknown 33 
Years of experience No. readers 
5-10 4 
10-15 5 
15-20 4 
20+ 5 
Unknown 33 
Job title No. readers 
Consultant Radiologist 8 
Consultant Radiographer 6 
Advanced Practitioner Radiographer 4 
Unknown 33 


b US reader study 


Reads per year Years of experience Fellowship trained 


Reader 1 5,500 12 Yes 
Reader 2 4,000 7 No 
Reader 3 2,000 4 No 
Reader 4 3,000 12 No 
Reader 5 3,500 15 Yes 
Reader 6 2,500 10 No 


a, Detailed information was available for 18 of the 51 readers represented in the UK test set. Reads were performed as part of routine practice and so reflect the standard of care in the UK
screening programme. b, Experience levels of the six radiologists involved in the US reader study.
A statement on whether measurements were taken from distinct samples or whether the same sample was measured repeatedly


The statistical test(s) used AND whether they are one- or two-sided
A description of any assumptions or corrections, such as tests of normality and adjustment for multiple comparisons


A full description of the statistical parameters including central tendency (e.g. means) or other basic estimates (e.g. regression coefficient)
AND variation (e.g. standard deviation) or associated estimates of uncertainty (e.g. confidence intervals)
Estimates of effect sizes (e.g. Cohen's d, Pearson's r), indicating how they were calculated
Policy information about availability of computer code


Data collection Dicom files were handled with the open source libraries DCMTK (https://support.dcmtk.org/docs/, version 3.6.1_20160630) and Pydicom 
(https://pydicom.github.io/, version v1.2.0). 


Data analysis The code used for training deep learning models has a large number of dependencies on internal tooling, infrastructure and hardware, 
and its release is therefore not feasible. However, all experiments and implementation details are described in sufficient detail in the 
Methods section to allow independent replication with non-proprietary libraries. Several major components of our work are available in 
open source repositories including Tensorflow (https://www.tensorflow.org, version 1.14.0) and the Tensorflow Object Detection API 
(https://github.com/tensorflow/models/tree/master/research/object_detection; Oct 15th, 2019 release). Data analysis was conducted in 
Python using the numpy (version v1.16.4), scipy (version 1.2.1), and scikit-learn (version 0.20.4) packages. 


For manuscripts utilizing custom algorithms or software that are central to the research but not yet described in published literature, software must be made available to editors/reviewers. 
We strongly encourage code deposition in a community repository (e.g. GitHub). See the Nature Research guidelines for submitting code & software for further information. 


Data 


Policy information about availability of data 
All manuscripts must include a data availability statement. This statement should provide the following information, where applicable: 


- Accession codes, unique identifiers, or web links for publicly available datasets 
- A list of figures that have associated raw data 
- A description of any restrictions on data availability 


The dataset from Northwestern Medicine was used under license for the current study, and is not publicly available. Applications for access to the OPTIMAM 
database can be made at https://medphys.royalsurrey.nhs.uk/omidb/getting-access/. 
The UK test set is a random sample of 10% of all women screened at two sites, St. George's and Jarvis, between the years 2012 and 2015.
Women from the US cohort were split randomly between train (55%), validation (15%) and test (30%). This scheme follows machine learning 
convention, but errs on the side of a larger test set to power statistical comparisons and include a more representative population. 
The size of the reader study was selected due to time and budgetary constraints. The case list was composed of 250 negative exams, 125
biopsy-confirmed negative exams and 125 biopsy-confirmed positive exams. We sought to include sufficient positives to power statistical 
comparisons on the metric of sensitivity, while avoiding undue enrichment of the case mixture. Biopsy-confirmed negatives were included to 
make the malignancy discrimination task more difficult. 


UK Dataset 


The data was initially compiled by OPTIMAM, a Cancer Research UK effort, between the years of 2010 and 2018 from St. George’s Hospital 
(London, UK), Jarvis Breast Centre (Guildford, UK) and Addenbrooke's Hospital (Cambridge, UK). The mammograms and associated metadata 
of 137,291 women were considered for inclusion in the study. Of these, 123,964 had both screening images and uncorrupted metadata. 
Exams that were recalled for reasons other than radiographic evidence of malignancy, or episodes that were not part of routine screening 
were excluded. In total, 121,850 women had at least one eligible exam. Women who were aged below 47 at the time of the screen were 
excluded from validation and test sets, leaving 121,455 women. Finally, women for whom there was no exam with sufficient follow-up were 
excluded from validation and test. This last step resulted in the exclusion of 5,990 of 31,766 test set cases (19%). 


The test set is a random sample of 10% of all women screened at two sites, St. George’s and Jarvis, between the years 2012 and 2015. 
Insufficient data was provided to apply the sampling procedure to the third site. In assembling the test set, we randomly selected a single 
eligible screening mammogram from each woman’s record. For women with a positive biopsy, eligible mammograms were those conducted 
in the 39 months (3 years and 3 months) prior to the biopsy date. For women that never had a positive biopsy, eligible mammograms were 
those with a non-suspicious mammogram at least 21 months later. The final test set consisted of 25,856 women. 
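The eligibility rule for UK test-set mammograms can be sketched as follows. This is an illustrative reading of the paragraph above, not the authors' implementation: the record layout (a list of `(exam_date, suspicious)` tuples), the helper names and the handling of month boundaries (days clamped to 28) are all assumptions.

```python
import random
from datetime import date


def add_months(d, months):
    """Shift a date by whole months, clamping the day to 28 so it stays valid."""
    index = d.year * 12 + (d.month - 1) + months
    return date(index // 12, index % 12 + 1, min(d.day, 28))


def eligible_exams(exams, biopsy_date=None):
    """Return dates of eligible screening exams for one woman.

    exams: list of (exam_date, suspicious) tuples.
    biopsy_date: date of a positive biopsy, or None if there never was one.
    """
    if biopsy_date is not None:
        # Positive biopsy: exams in the 39 months before the biopsy date.
        window_start = add_months(biopsy_date, -39)
        return [d for d, _ in exams if window_start <= d <= biopsy_date]
    eligible = []
    for d, _ in exams:
        # No positive biopsy: require a later non-suspicious exam
        # at least 21 months after this one.
        cutoff = add_months(d, 21)
        if any(later >= cutoff and not susp for later, susp in exams):
            eligible.append(d)
    return eligible


def pick_test_exam(exams, biopsy_date=None, rng=random):
    """Randomly select a single eligible exam, or None if there is none."""
    candidates = eligible_exams(exams, biopsy_date)
    return rng.choice(candidates) if candidates else None
```

Selecting one exam per woman keeps the test cases independent at the patient level, which is presumably why a single mammogram was drawn from each record.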

The US dataset included records from all women that underwent a breast biopsy between 2001 and 2018. It also included a random sample 
of approximately 5% of all women who participated in screening, but were never biopsied. This heuristic was employed in order to capture all 
cancer cases (to enhance statistical power) and to curate a rich set of benign findings on which to train and test the AI system.
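Because biopsied women were all retained while unbiopsied women were sampled at roughly 5%, raw specificity on the US data over-represents hard negatives. The sketch below shows how inverse probability weighting corrects for this. It assumes a weight of 1 for biopsied cases and 1/0.05 = 20 for sampled unbiopsied cases; the exact weights used in the study are given in the Methods, not here, so these values are illustrative only.

```python
def weighted_specificity(predictions, labels, weights):
    """Specificity in which each negative case counts `weight` times.

    predictions: booleans, True = flagged as suspicious.
    labels: booleans, True = cancer.
    weights: inverse sampling probabilities, one per case.
    """
    true_neg = sum(
        w for p, y, w in zip(predictions, labels, weights) if not y and not p
    )
    all_neg = sum(w for y, w in zip(labels, weights) if not y)
    return true_neg / all_neg


# Two biopsied negatives (weight 1) that were flagged, and one unbiopsied
# negative (assumed weight 20) that was correctly left unflagged.
predictions = [True, True, False]
labels = [False, False, False]
unweighted = weighted_specificity(predictions, labels, [1, 1, 1])
reweighted = weighted_specificity(predictions, labels, [1, 1, 20])
```

In this toy case the unweighted specificity is 1/3, whereas the weighted value is 20/22, showing how enrichment with biopsied negatives depresses the unweighted figure.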


US Dataset 


Among women with a completed mammogram order, we collected the records from all women with a pathology report containing the term 
“breast”. Among those that lacked such a pathology report, women whose records bore an International Classification of Diseases (ICD) code 
indicative of breast cancer were excluded. Approximately 5% of this population of unbiopsied negative women were sampled. After de- 
identification and transfer, women were excluded if their metadata was either unavailable or corrupted. The women in the dataset were split 
randomly among train (55%), validation (15%) and test (30%). For testing, a single case was chosen for each woman following a similar 
procedure as in the UK dataset. In women who underwent biopsy, we randomly chose a case from the 27 months preceding the date of 
biopsy. For women who did not undergo biopsy, one screening mammogram was randomly chosen from among those with a follow up event 
at least 21 months later. 


The radiology reports associated with cases in the test set were used to flag and exclude cases that depicted breast implants
or were recalled for technical reasons. To compare the AI system against the clinical reads performed at this site, we employed clinicians to
manually extract BI-RADS scores from the original radiology reports. There were some cases for which the original radiology report could not
be located, even if a subsequent cancer diagnosis was biopsy-confirmed. This might have happened, for example, if the screening case was
imported from an outside institution. Such cases were excluded from the clinical reader comparison.


All attempts at replication were successful. Comparisons between AI system and human performance revealed consistent trends across three
settings: a UK clinical environment, a US clinical environment, and an independent, laboratory-based reader study. Our findings persisted
through numerous retrainings with random network initialization and training data iteration order. Remarkably, our findings on the US test set
replicated even when we trained the AI system solely on UK data.


Patients were randomized into training, validation, and test sets by applying a hash function to the deidentified medical record number. 
Assignment to each set was made based on the value of the resulting integer modulo 100. For the UK data, values of 0-9 were reserved for 
the test set. For the US data, values of 0-29 were reserved for the test set. 
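The hash-based assignment described above can be sketched in a few lines. The source does not say which hash function was used or which residues were reserved for validation, so SHA-256 and the contiguous validation range below are assumptions made purely for illustration.

```python
import hashlib


def bucket_for(record_id):
    """Map a de-identified record number to an integer in [0, 100)."""
    # SHA-256 is an assumption; the source only says "a hash function".
    digest = hashlib.sha256(record_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def assign_split(record_id, n_test, n_validation):
    """Assign a patient to 'test', 'validation' or 'train'.

    Buckets 0..n_test-1 go to the test set, as in the source (10 for the
    UK data, 30 for the US data). Placing validation in the next
    n_validation buckets is a hypothetical choice.
    """
    bucket = bucket_for(record_id)
    if bucket < n_test:
        return "test"
    if bucket < n_test + n_validation:
        return "validation"
    return "train"
```

For example, `assign_split(record_id, 30, 15)` would mirror the 55/15/30 split quoted for the US cohort. Because the assignment depends only on the record number, every exam from the same patient lands in the same split and the partition is reproducible without storing a lookup table.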
The US and UK test sets were held back from AI system development, which only took place on the training and validation sets. Investigators
did not access test set data until models, hyperparameters, and thresholds were finalized. None of the readers who interpreted the images
(either in the course of clinical practice or in the context of the reader study) had knowledge of any aspect of the AI system.
Human research participants


Policy information about studies involving human research participants 


Population characteristics The focus of the paper is on breast cancer screening, so all individuals in the population were women from the screening 
populations in the US and UK. 


The UK dataset was collected from three breast screening sites in the United Kingdom National Health Service Breast Screening 
Programme (NHSBSP). The NHSBSP invites women aged between 50 and 70 who are registered with a general practitioner (GP) 
for mammographic screening every 3 years. Women who are not registered with a GP, or who are older than 70, can self-refer 
to the screening programme. Specifically, there were 25,856 women in the test set, of which 268 (1%) had breast cancer 
detected during screening. For many cancers in the test set, additional metadata was available. There was a rich collection of 
both invasive (76.1%) and non-invasive cancers (21.6%). The invasiveness of 2.2% of cancers was unknown. Lesion sizes ranged
from less than 10 mm to greater than 50 mm.


The US dataset was collected from Northwestern Memorial Hospital (Chicago, IL) between the years of 2001 and 2018. In the
US, each screening mammogram is typically read by a single radiologist, and screens are conducted annually or biennially. The
breast radiologists at this hospital are fellowship-trained and only interpret breast imaging studies. Their experience levels
ranged from 1-30 years. The American College of Radiology (ACR) recommends that women start routine screening at the age of
40, while other organizations including the US Preventive Services Task Force (USPSTF) recommend initiation at 50 for women
with average breast cancer risk. For all the cancers in the test set, additional metadata was available. For example, 66.9% of the
cancers were invasive, 27.9% were DCIS and the rest were of another cancer subtype.


Recruitment Patient data were gathered retrospectively from screening practices in the UK and US. As such, they reflect natural screening 
populations at the sites under study. Self-selection biases associated with the choice to enroll in screening may be present, but 
are likely to be representative of the real-world patient population. 


In the UK, the NHSBSP invites women aged between 50 and 70 who are registered with a general practitioner (GP) for 
mammographic screening every 3 years. Women who are not registered with a GP, or who are older than 70, can self-refer to 
the screening programme. Specifically, for this paper, the data was initially compiled by OPTIMAM, a Cancer Research UK effort, 
from three sites between the years of 2010 and 2018: St. George’s Hospital (London, UK), Jarvis Breast Centre (Guildford, UK) and
Addenbrooke's Hospital (Cambridge, UK). The collected data included screening and follow-up mammograms (comprising 
mediolateral oblique “MLO” and craniocaudal “CC” views of the left and right breast), all radiologist opinions (including the 
arbitration result, if applicable) and metadata associated with follow-up treatment. The test set is a random sample of 10% of all 
women screened at two sites, St. George’s and Jarvis, between the years 2012 and 2015. Insufficient data was provided to apply 
the sampling procedure to the third site. 


In the US, the American College of Radiology, the American Cancer Society, and the US Preventive Services Task Force
recommend screening every 1 or 2 years for women starting at age 40 or 50. The various US guidelines are summarized at
https://www.acraccreditation.org/mammography-saves-lives/guidelines. Our US dataset was collected from Northwestern 
Memorial Hospital (Chicago, IL) between the years of 2001 and 2018. The US dataset included records from all women that 
underwent a breast biopsy between 2001 and 2018. It also included a random sample of approximately 5% of all women who 
participated in screening, but were never biopsied. This heuristic was employed in order to capture all cancer cases (to enhance 
statistical power) and to curate a rich set of benign findings on which to train and test the AI system.


Ethics oversight Use of the UK dataset for research collaborations by both commercial and non-commercial organisations received ethical 
approval (Research Ethics Committee reference 14/SC/0258). 
The US data was fully de-identified and released only after an Institutional Review Board approval (STU00206925).


Note that full information on the approval of the study protocol must also be provided in the manuscript. 
Mycobacterium tuberculosis (Mtb) is the leading cause of death from infection
worldwide. The only available vaccine, BCG (Bacillus Calmette-Guérin), is given
intradermally and has variable efficacy against pulmonary tuberculosis, the major
cause of mortality and disease transmission. Here we show that intravenous
administration of BCG profoundly alters the protective outcome of Mtb challenge in 
non-human primates (Macaca mulatta). Compared with intradermal or aerosol 
delivery, intravenous immunization induced substantially more antigen-responsive 
CD4 and CD8 T cell responses in blood, spleen, bronchoalveolar lavage and lung 
lymph nodes. Moreover, intravenous immunization induced a high frequency of 
antigen-responsive T cells across all lung parenchymal tissues. Six months after BCG 
vaccination, macaques were challenged with virulent Mtb. Notably, nine out of ten 
macaques that received intravenous BCG vaccination were highly protected, with six 
macaques showing no detectable levels of infection, as determined by positron 
emission tomography—computed tomography imaging, mycobacterial growth, 
pathology and granuloma formation. The finding that intravenous BCG prevents or 
substantially limits Mtb infection in highly susceptible rhesus macaques has 
important implications for vaccine delivery and clinical development, and provides a 
model for defining immune correlates and mechanisms of vaccine-elicited protection 
against tuberculosis. 


Two billion people worldwide are infected with Mtb, with 10 million
new cases of active tuberculosis (TB) and 1.7 million deaths each year.
Prevention of pulmonary infection or disease in adolescents and adults
would have the largest effect on the epidemic by controlling Mtb transmission.
The only licensed TB vaccine, BCG (live, attenuated Mycobacterium
bovis), is administered intradermally at birth and provides
protection against disseminated TB in infants but has variable efficacy
against pulmonary disease in adolescents and adults.

T cell immunity is required to control Mtb infection and prevent
clinical disease. A major hurdle to developing an effective and durable
T-cell-based vaccine against pulmonary TB is to induce and sustain
T cell responses in the lung to immediately control infection while
also eliciting a reservoir of systemic memory cells to replenish the
lung tissue. Intradermal and intramuscular administration, the most
common routes of vaccine administration, do not induce high frequencies
of resident memory T (TRM) cells in the lung. Studies performed
50 years ago suggested that administration of BCG by aerosol (AE) or
intravenous (IV) routes enhanced protection in non-human primates
(NHPs) challenged shortly after immunization. However, there
remains a limited understanding of the mechanisms by which dose and
route of BCG influence systemic and tissue-specific T cell immunity, and
whether optimizing these variables would lead to high-level prevention
of Mtb infection and disease. We hypothesized that a sufficiently high
dose of IV BCG would elicit a high frequency of systemic and
tissue-resident T cells mediating durable protection against Mtb infection
and disease in highly susceptible rhesus macaques.


Experimental design and safety 


The central aim of this study was to assess how the route and dose of 
BCG vaccination influence systemic and tissue-resident T cell immunity, 
and protection after Mtb challenge. Rhesus macaques were vaccinated 
with 5 x 10^7 colony-forming units (CFUs) of BCG by intradermal (IDhigh),
AE or IV routes, or with a combination of both AE (5 x 10^7 CFUs) and ID
(5 x 10^5 CFUs; AE/ID) (Extended Data Fig. 1a). Immune responses and
protective efficacy of these regimens were compared to the standard
human dose given ID (5 x 10^5 CFUs; IDlow). The dose of BCG selected
for AE and IV vaccine groups was based on pilot dose-ranging studies
(Supplementary Data 1). After BCG vaccination, immune responses in 
blood and bronchoalveolar lavage (BAL) were assessed over 24 weeks, 
after which NHPs were challenged with a low dose of Mtb (Extended 
Data Fig. 1b). Other macaques in each group were euthanized 1 or 
6 months after vaccination for immune analysis of tissue responses 
(Extended Data Fig. 1c). To assess safety of BCG vaccinations, several
clinical parameters were measured and found to be transiently affected
only by IV BCG (Extended Data Fig. 2). A summary of all NHPs in this
study and the doses of BCG and Mtb administered is provided in Extended
Data Fig. 1c and Supplementary Table 1.


Cellular composition of BAL and blood 


Because generating immune responses in the lung was a major focus 
of the study, we first assessed whether the BCG vaccination regimen 
altered the number or composition of leukocytes in the BAL. Only IV 
BCG vaccination elicited significant changes in BAL cell numbers: a 
5-10-fold increase in total cells, accounted for largely by conventional 
T cells (Fig. 1a and Supplementary Data 2a, b). This resulted in a sus- 
tained inversion of the alveolar macrophage:T-cell ratio up to 6 months 
after IV BCG vaccination (Extended Data Fig. 3a). Non-classical T cells 
(MAIT and Vγ9+ γδ) that can contribute to protection against TB were
transiently increased 2-4 weeks after IV BCG (Fig. 1a, Extended Data 
Fig. 3b and Supplementary Data 2b). A similar analysis performed on 
peripheral blood mononuclear cells (PBMCs) showed no significant 
changes in leukocyte composition (Extended Data Fig. 3c, d). Neither 
BAL nor PBMCs exhibited changes in the proportion of natural killer 
cells, which were recently suggested to correlate with protection
(Extended Data Fig. 3a, c). Finally, there were no increases in cytokines
associated with trained innate immunity in stimulated PBMCs after
ID or IV BCG immunization (Supplementary Data 3). Overall, these 
data show that IV BCG immunization, in contrast to AE or ID, results 
in significant and sustained recruitment of T cells to the airways and 
substantially alters the ratio of T cells to macrophages. 


Antigen-responsive adaptive immunity 


We next evaluated how these regimens influenced the ability of T cells 
responsive to mycobacterial antigen (such as purified protein
derivative (PPD)) to produce the canonical cytokines (IFNγ, IL-2, TNF or IL-17)
that are important for protection against TB. At the peak of the
PBMC response (week 4), cytokine-producing CD4 T cells were higher
in NHPs immunized with IDhigh or IV BCG compared with those
immunized with IDlow BCG; these responses declined over time but remained
increased at week 24 (time of challenge; Fig. 1b and Extended Data
Fig. 4a, g). PBMC CD8 responses in IV-immunized NHPs were greater
than those in IDlow NHPs at both time points (Fig. 1c and Extended Data Fig. 4b, h).
In BAL, antigen-responsive T cells peaked at 8 weeks and were largely
maintained until time of challenge (Fig. 1d, e and Extended Data Fig. 4c,
d). Compared with IDlow BCG, IDhigh or AE BCG immunization elicited
tenfold more PPD-responsive CD4 T cells in BAL; IV BCG elicited
100-fold more PPD-responsive CD4 T cells, with approximately 40%
of cells responding (Fig. 1d). Furthermore, only IV BCG induced an
increase in antigen-responsive CD8 T cells (Fig. 1e). Central memory
and transitional memory (TTM) T cells comprised the majority of CD4
T cell responses in PBMCs across all vaccine groups at the peak of the 
response, whereas T,,, cells predominated in the BAL (Extended Data 
Fig. 4e, f). IV-BCG-vaccinated NHPs had the largest proportion of T,., 
cells in PBMCs and effector memory (TEM) cells in BAL.

Despite differences in the magnitude of T cell responses among 
vaccine regimens, there were no differences in the quality of T cell
responses (that is, the proportion of cells producing each combination
of IFNγ, IL-2, TNF and IL-17) in PBMCs (Extended Data Fig. 5a and
Supplementary Data 4) or the BAL (Extended Data Fig. 5b and
Supplementary Data 5). Of the CD4 T cell responses, 90% consisted of
T helper 1 (TH1) cytokines, with fewer than 10% also producing IL-17; most
IL-17-producing CD4 T cells co-expressed TH1 cytokines (Extended Data
Fig. 5). Notably, approximately 10% of antigen-responsive CD4 T cells
in PBMCs expressed CD154 but no TH1 or TH17 cytokines (Extended
Data Fig. 5a and Supplementary Data 4), which suggests that there may
be underlying qualitative differences among vaccine group responses
that are not measured by the canonical T cell cytokines commonly used
to assess BCG-elicited immunity.

To expand the qualitative analysis of BAL T cell responses using an 
orthogonal approach, we performed single-cell mRNA sequencing 
(scRNA-seq) with Seq-Well to comprehensively assess phenotypic
and transcriptional states among T cells that might underlie protective 
vaccine responses (Fig. 1f-h, Extended Data Fig. 6 and Supplementary 
Data 6). We examined correlated patterns of gene expression within 
unstimulated and PPD-stimulated T cells from BAL to identify groups of 
genes for which the coordinated activity differed by regimen (Extended 
Data Fig. 6b). A total of seven significant T cell modules were identi- 
fied among in vitro-stimulated T cells 13 weeks after immunization 
(Supplementary Table 2) and used to generate expression scores across 
all T cells at weeks 13 and 25. Among these, we identified a stimulation- 
inducible module of gene expression, module 2, enriched for memory 
T cell functionality (Supplementary Table 3 and Methods), primarily 
expressed in a population of BAL CD4 T cells from IV-BCG-immunized
NHPs at week 13, and maintained until week 25 (Fig. 1f, g, Extended Data 
Fig. 6c, d and Supplementary Table 2). Differential gene expression 
analysis, comparing T cells positive and negative for module 2 (Fig. 1h 
and Supplementary Table 4), showed enrichment of genes previously 
associated with protection against TB, including IFNG, TBX21, RORC,
TNFSF8 and IL21R.
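
The Fig. 1g legend defines module 2 positivity as a score more than 2 s.d. above the mean score of the unvaccinated NHPs. A minimal Python sketch of that thresholding step follows. The function names, the use of the sample standard deviation and the example scores are illustrative assumptions, not the authors' code or data.

```python
from statistics import mean, stdev


def positivity_threshold(naive_scores, n_sd=2.0):
    """Cutoff = mean + n_sd * s.d. of module scores in unvaccinated cells.

    Assumption: the sample standard deviation is used; the source does not
    say whether a sample or population s.d. was applied.
    """
    return mean(naive_scores) + n_sd * stdev(naive_scores)


def percent_positive(scores, threshold):
    """Percentage of cells whose module score exceeds the cutoff."""
    if not scores:
        return 0.0
    return 100.0 * sum(s > threshold for s in scores) / len(scores)


# Hypothetical per-cell module 2 scores (illustrative values, not study data).
naive = [0.0, 0.1, -0.1, 0.2, -0.2]
iv_bcg = [0.1, 0.9, 1.2, 0.05, 0.8]

cutoff = positivity_threshold(naive)
pct = percent_positive(iv_bcg, cutoff)
print(f"cutoff={cutoff:.3f}, positive={pct:.0f}%")
```

Anchoring the cutoff to the unvaccinated group means that "positive" is read as "above the range seen without vaccination", so the percentage is comparable across regimens and time points.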

In a further analysis of adaptive immunity, we found that IV BCG elicited
higher antibody responses in the BAL and plasma than the other routes. 
Mtb-specific IgG, IgA and IgM peaked 4 weeks after IV BCG vaccination 
and returned to baseline by 24 weeks in the BAL (Extended Data Fig. 7). 


M. tuberculosis challenge outcome 


Six months after BCG immunization, NHPs were challenged in three 
separate cohorts with a nominal dose of 10 CFUs of the highly
pathogenic Mtb Erdman strain, with a pre-defined study end point of 12 weeks
after challenge (Extended Data Fig. 1b, c and Supplementary Table 1).
Infection and disease were tracked serially using 18F-fluorodeoxyglucose
(FDG) positron emission tomography-computed tomography
(PET-CT) imaging. Total FDG activity in lungs, a measure of cellular
metabolism that correlates with total thoracic mycobacterial burden,
was negative in all immunized macaques before Mtb challenge,
but was increased throughout infection in unvaccinated NHPs (Fig. 2a). 
Three-dimensional reconstructions of pre-necropsy PET-CT scans are 
shown in Fig. 2b. All IDlow- and AE-BCG-immunized NHPs had increased
FDG activity in lungs over 12 weeks. Two NHPs in the IDhigh and AE/ID BCG
groups had no lung FDG activity and two NHPs in the IDhigh group had
inflammation at 8 weeks that returned to baseline by 12 weeks,
suggesting partial protection. By contrast, nine out of ten IV-BCG-immunized
NHPs had no lung FDG activity throughout the challenge phase (Fisher’s 
exact test, P < 10^-4 compared to IDlow BCG) (Fig. 2a-c).

PET-CT was used to track granuloma formation after Mtb infection 
as a correlate of active disease. By 4 weeks and throughout infection,
granulomas were detected in all unvaccinated as well as IDlow-, IDhigh-,
AE- and AE/ID-BCG-immunized NHPs (Fig. 2a). By contrast,
IV-BCG-immunized NHPs had fewer granulomas compared with the
benchmark IDlow BCG regimen (P < 0.001), with six out of ten NHPs having
no granulomas throughout infection (Fig. 2a, d). Detailed necropsies

Fig. 1 | Cellular composition and immune analysis in blood and BAL after
BCG vaccination. a, Number of cells (geometric mean) per BAL collection for 
leukocyte populations in each vaccine group before (pre, P) and up to 24 weeks 
after BCG; Supplementary Data 2 shows individual NHPs and statistical 
comparisons. Data are from cohorts 1-4 (n=11-13 macaques per group as 
outlined in Extended Data Fig. 1) except at weeks 2, 20 and 24 (cohort 4 only, 
n=3). Vγ9+, Vγ9+ γδ T cells; MAIT, mucosal-associated invariant T cells; mDC,
myeloid dendritic cells; NK, natural killer cells; iNKT, invariant natural killer
T cells; pDC, plasmacytoid dendritic cells. b, c, Percentage of memory CD4 (b) or
CD8 (c) T cells in PBMCs producing IFNγ, IL-2, TNF or IL-17 after PPD stimulation
in vitro. Shown are individual and median (horizontal bar) responses for NHPs
in the challenge study (cohorts 1-3, n = 8-10 macaques) at weeks 4 (peak) and 24
(time of challenge) after BCG vaccination. d, e, Percentage (top) and number
(bottom) of cytokine+ memory CD4 (d) and CD8 (e) T cells in the BAL before and
up to 16 weeks after BCG vaccination. Shown are individual (grey lines) and
mean (coloured lines) responses for challenge cohorts (n = 8-10 macaques).
Each group was compared to IDlow at weeks 4 and 24 for PBMCs (one-way
ANOVA; P values are from Dunnett’s multiple comparison test) or weeks 8 and 16 for
BAL (Kruskal-Wallis test; P values are from Dunn’s multiple comparison test).

f-h, Single-cell transcriptional analysis of BAL cells at weeks 13 and 25 after BCG 
vaccination (cohort 4; n=3 per group). f, Z-scored heat maps of the average 
cellular score for modules identified in week 13 PPD-stimulated T cells at weeks 
13 and 25 after BCG vaccination. Red P values indicate modules uniquely 
elevated in the IV BCG group (one-way ANOVA). g, Distributions of module 2 
expression in unstimulated and stimulated T cells at weeks 13 and 25 for each 
group. Percentage module 2-positive is shown; positivity (dashed line) defined 
as 2 s.d. above the mean score of the unvaccinated (Naive) NHPs. h, Volcano
plot showing differentially expressed genes between T cells positive and
negative for module 2 at week 13 (P values calculated using the likelihood ratio
test with Bonferroni correction).

Fig. 2 | Protection against Mtb infection after IV BCG immunization. a, Lung
inflammation (total FDG activity) and number of lung granulomas over the 
course of infection as measured by serial PET-CT scans. Each line shows one 
NHP over time; 3 NHPs (2 unvaccinated (unvax) and 1 IDlow) reached a humane
end point before 12 weeks. tntc, too numerous to count. b, Three-dimensional 
volume renderings of PET-CT scans of each NHP at the time of necropsy. PET 
was limited to the thoracic cavity; the standardized uptake value colour bar is 
shown in the top right and indicates FDG retention, a surrogate for 
inflammation. c-h, Total lung FDG activity (c), number of lung granulomas (d), 


showed that the IV-BCG-immunized group had lower gross pathology
scores (Fig. 2e) compared with the IDlow BCG group (P = 0.002)
and was the only group without detectable extrapulmonary disease 
(Extended Data Fig. 8a). 

The primary measure of protection was a comprehensive quantifi- 
cation of Mtb burden (CFUs) at necropsy. The median total thoracic 
CFUs for IDlow BCG (5.1 ± 1.3, median ± interquartile range of
log10-transformed total CFUs) was slightly lower than that of unvaccinated
NHPs (5.9 ± 1.0 log10-transformed CFUs), consistent with IDlow BCG
having a minimal protective effect in rhesus macaques (Fig. 2f). By
contrast, the median total thoracic CFUs in IV-BCG-immunized NHPs
was 0 (±16 CFUs), a more than 100,000-fold reduction compared with

gross pathology score (e), total thoracic CFUs (mycobacterial burden) (f), total
lung CFUs (g) and total thoracic LN CFUs (h) at time of necropsy. Dashed line in 
e is the assumed normal pathology score accounting for variability in LN size in
healthy rhesus macaques. c-h, Symbols represent individual challenged
macaques (cohorts 1-3, n = 8-10 vaccinated NHPs; n = 4 unvaccinated NHPs)
and horizontal bars represent the median; all data points within the grey areas
are zero. Kruskal-Wallis tests were used and reported P values represent
Dunn’s multiple comparison test comparing each group to the IDlow group.


IDlow BCG (P = 0.006). Six out of ten IV-BCG-immunized macaques had
no detectable Mtb in any tissue measured, and another three macaques
had <45 total CFUs, all contained within one granuloma. Only one of
ten IV BCG NHPs was not protected, with CFU values similar to IDlow
NHPs (Fig. 2f). The IDhigh, AE and AE/ID groups had bacterial burdens
similar to IDlow BCG.
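
The "more than 100,000-fold" figure follows from the reported medians: a log10-transformed median of 5.1 corresponds to roughly 1.26 x 10^5 CFUs, set against a median of 0 CFUs after IV BCG. The short Python sketch below works through that arithmetic. The function name and the 1 CFU floor used to keep the ratio finite are assumptions made here for illustration; the source does not say how the zero median was handled.

```python
def fold_reduction(log10_reference, treated_cfu, floor_cfu=1.0):
    """Fold reduction of a treated CFU count relative to a log10 reference.

    A treated count below floor_cfu (including zero) is replaced by
    floor_cfu, an assumed convention, so that the ratio stays finite.
    """
    reference_cfu = 10.0 ** log10_reference
    return reference_cfu / max(treated_cfu, floor_cfu)


# Median log10 CFUs reported for the benchmark group and the median CFU
# count reported for the IV group.
fold = fold_reduction(log10_reference=5.1, treated_cfu=0.0)
print(f"{fold:,.0f}-fold")
```

Under that floor the ratio comes out at about 1.26 x 10^5, in line with the statement in the text.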

Total thoracic bacterial burden can be separated into lung (Fig. 2g) 
and thoracic lymph node (LN) (Fig. 2h) CFUs. Only the IV BCG group 
was lower than the IDlow BCG group (lung, P = 0.006; LNs, P = 0.001),
with nine of ten NHPs having no Mtb-positive LNs (Fig. 2h). 

Protection can be defined as having less than a given number of total 
thoracic Mtb CFUs. By this criterion, protection was highly significant
(Fisher’s exact test, P < 10^-4) at any given threshold less than 10,000
CFUs (Extended Data Fig. 8b), with the IV BCG group showing 90% 
protection (95% confidence interval: 60-98%) at a threshold as low as 
50 CFUs. Thus, IV BCG confers an unprecedented degree of protection
in a stringent NHP model of TB.
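
The interval quoted above can be illustrated with a binomial confidence interval for nine protected macaques out of ten. The source does not state which interval method was used; the Python sketch below assumes a Wilson score interval, which gives limits of roughly 60% and 98% for these counts. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes, n, confidence=0.95):
    """Wilson score interval for a binomial proportion."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)  # about 1.96 for 95%
    p_hat = successes / n
    denom = 1.0 + z * z / n
    centre = (p_hat + z * z / (2.0 * n)) / denom
    half = z * sqrt(p_hat * (1.0 - p_hat) / n + z * z / (4.0 * n * n)) / denom
    return centre - half, centre + half


# Nine of ten IV-BCG-immunized macaques fell below the 50 CFU threshold.
low, high = wilson_interval(9, 10)
print(f"{low:.0%} to {high:.0%}")
```

A score-type interval is a common choice for small groups because the simple normal (Wald) interval can run past 100% when the observed proportion is close to 1.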


Immune responses after Mtb challenge 


Measuring immune responses after challenge informs whether 
vaccine-elicited responses are boosted (anamnestic), and if de novo 
(primary) responses are generated to antigens expressed by the chal- 
lenge microorganism (but not the vaccine). T cell responses to ESAT-6 
and CFP-10—proteins expressed in Mtb but not BCG—are used to detect 
primary Mtb infection, even in BCG-immunized individuals. Peripheral 
T cell and antibody responses to these Mtb-specific antigens and those
expressed by both BCG and Mtb (for example, PPD), were assessed after 
Mtb challenge (Extended Data Fig. 9). In contrast to all other groups, 
IV-BCG-immunized NHPs had low to undetectable primary or anam- 
nestic T cell and antibody responses after TB infection, which suggests 
rapid elimination of Mtb after challenge. 


BCG and immune responses in tissues 


To provide insight into the potential mechanisms of IV-BCG-induced 
protection, we quantified BCG CFUs and T cell responses in tissues 
1 month after vaccination. BCG was detected at the skin site(s) of
injection and draining axillary LNs in ID-BCG-vaccinated NHPs, but not in
lung lobes (Fig. 3a). In AE- or AE/ID-BCG-vaccinated NHPs, BCG was
detected primarily in lung lobes and BAL. By contrast, BCG was detected
in the spleen of all four IV-BCG-vaccinated NHPs, as well as in BAL, lung
lobes, and peripheral and lung LNs (Fig. 3a). Indeed, PET-CT scans at
2 and 4 weeks after BCG vaccination showed increased metabolism
localized to lung LNs, lung lobes and spleen elicited by the IV route but not
by the other routes (Extended Data Fig. 10a).

CD4 T cell responses in IV-BCG-immunized NHPs were increased 
in spleen and lung compared to IDlow NHPs (Fig. 3b), consistent with
detection of BCG at the same sites. Moreover, CD4 T cell responses 
were observed in systemic sites such as PBMCs, bone marrow and 
peripheral LNs. CD8 responses were highest in lung lobes, BAL and 
spleen after IV BCG (Fig. 3c). After IDhigh BCG vaccination, CD4 T cell
responses were detected in spleen, bone marrow and axillary LNs, 
but were limited in lung lobes and lung LNs, whereas responses in AE 
groups were confined to the lung and BAL. Collectively, these data 
indicate compartmentalization of BCG detection and T cell immunity 
by vaccine route, which highlights the systemic distribution of immune 
responses after IV BCG versus the more limited and localized responses 
following ID and AE delivery. 

Further analysis of lung tissue one month after vaccination showed 
increased cell counts (Fig. 3d) after IV BCG with increased numbers 
of CD3+ T cells and CD11c+ antigen-presenting cells (Fig. 3e). These
clustered into ‘microgranulomas’ that were histologically distinct 
from bronchus-associated lymphoid tissue (BALT) (Fig. 3f). IV-BCG- 
vaccinated macaques had transient splenomegaly as well as enlarged 
thoracic LNs that contained non-necrotizing granulomas and lymphoid 
follicular hyperplasia, often with active germinal centres (Extended 
Data Fig. 10b-e). 

Six months after BCG vaccination (time of challenge), NHPs that 
received IV BCG maintained increased frequencies of antigen-respon- 
sive T cells in spleen, lung and BAL (Extended Data Fig. 11a, b). Notably, 
the numbers of total, CD3+ or CD11c+ cells in lung tissue had normalized,
and lung histopathology, spleen size and FDG uptake in
IV-BCG-vaccinated macaques were indistinguishable from IDlow BCG macaques
(Extended Data Fig. 11c-g). Although BCG burden was not measured
in these NHPs, no BCG (or Mtb) CFUs were detected in six out of ten
IV-BCG-immunized, challenged macaques at 9 months after BCG.
Collectively, these data suggest that BCG is cleared between 1 and 9
months after IV vaccination.


T cells in lung tissue after BCG 


To substantiate whether T cells isolated from lung lobes one month 
after IV BCG were TRM cells, labelled anti-CD45 antibody was injected
IV into NHPs just before necropsy, a technique shown to delineate
tissue-derived (ivCD45−) from vasculature-derived (ivCD45+)
leukocytes. Ex vivo phenotypic analysis of CD69 expression (a marker
of TRM cells and/or T cell activation) in combination with ivCD45 staining
revealed that more than 80% of CD4 T cells isolated from all lung lobes
of IV-BCG-immunized NHPs were derived from the lung parenchyma
(CD69+ivCD45−) (Fig. 4a). Of note, more than 1,000 BCG CFUs were
cultured from every lung lobe in this macaque. By contrast, IDhigh and
AE BCG vaccination resulted in 16-35% tissue-derived (CD69+ivCD45−)
CD4 T cells in the lung lobes, with few or undetectable BCG CFUs. T cells
from BAL in all NHPs were uniformly CD69+ivCD45−. Similar results
were observed in the CD8 T cell compartment of the same macaques 
(Supplementary Data 7). 

After in vitro antigen stimulation to assess antigen-responsive T cells 
in tissue, lung tissue-derived (ivCD45−) IFNγ-producing CD4 T cells were
observed in all lung lobes and lung LNs of IV-BCG-immunized NHPs
(Fig. 4b and Extended Data Fig. 12). Antigen-responsive lung T cells were
largely CD69+, with a subset also expressing the tissue-homing marker
CD103, which is expressed on some TRM cells (Fig. 4c). Thus, these cells
may represent bona fide TRM cells, or recently activated T cells owing
to the presence of BCG (Fig. 4a). Overall, these data show that IV BCG 
vaccination provided the highest level of protection concomitant with 
increased antigen-responsive T cells throughout lung tissue. 

The increased detection of T cell responses in tissues containing BCG 
suggests that alternative approaches to lung vaccine delivery may be 
crucial for generating TRM cells. Indeed, direct endobronchial
instillation of BCG into a single lung lobe protected two out of eight NHPs
against Mtb challenge in the same lobe. To determine how
endobronchial BCG would affect T cells in the lung parenchyma, BCG was instilled
directly into the left lung lobes of NHPs. Approximately 75% of CD4 and
CD8 T cells isolated from the two left lung lobes were CD69+ivCD45−,
compared with 7-45% in the right lobes (Fig. 4a and Supplementary 
Data 7a). Notably, BCG CFUs (>10*) were detected in the left (but not 
right) lung lobes where the CD4 T cell response was highest (Extended 
Data Fig. 12). Collectively, these data suggest a general concordance 
between the presence of BCG in a given tissue after vaccination and 
the detection of antigen-responsive T cells. 


Immune associations of bacterial control 


Several multiple regressions were used to test whether peak antigen- 
responsive CD4 or CD8 T cells in the BAL or PBMCs after BCG immu- 
nization were associated with disease severity (Extended Data Fig. 13, 
Supplementary Tables 1 and 5). These analyses show that the route of 
BCG vaccination was the primary determinant of Mtb control with IV 
being the only regimen that afforded significant protection (Extended 
Data Fig. 8b). 


Discussion 


The data demonstrating that IV BCG immunization results in markedly
increased antigen-responsive T cells, including T cells systemically 
and throughout the lung parenchyma, and unprecedented protection 
against Mtb challenge, represent a major step forward in the field of 
TB vaccine research. 

The concept of alternative immunization routes rather than the stand- 
ard ID approach was suggested 50 years ago in NHP studies comparing 
IV and AE immunization. More recently, decreased lung pathology

Fig. 3 | BCG CFUs and immune responses in tissues one month after BCG
immunization. NHPs (cohorts 5a-c: IDlow, IDhigh and IV, n = 4 NHPs; AE and
AE/ID, n = 2 NHPs) were euthanized one month after vaccination to quantify BCG
and T cell responses in tissues. a, BCG CFUs at vaccination site(s) (skin, ID only)
and in various tissues (per ml blood or bone marrow; per whole spleen, LN or
lung lobe; or per total BAL collected). L, left; R, right; ND, not determined.

b, c, Frequency of memory CD4 (b) and CD8 (c) T cells producing IFNγ, IL-2, TNF
or IL-17 after PPD stimulation. Matched symbols within each vaccine group are
the same macaque. Kruskal-Wallis tests were run and reported P values
represent Dunn’s multiple comparison test comparing each group to the IDlow
group. d, Total viable cells per gram of lung tissue for each vaccine regimen;


and a trend towards increased survival was reported after IV BCG
immunization compared with unvaccinated NHPs. AE immunization with
an attenuated Mtb strain enhanced cellular immunity in the BAL, and
reduced lung pathology and bacterial burdens, after high-dose
challenge 8 weeks later with a low-virulence Mtb strain (CDC1551). In a
different method of pulmonary delivery, BCG instilled directly into the lower
left lung lobe (that is, endobronchially) prevented infection and disease
in two out of eight NHPs after repeated limiting-dose Mtb challenge in
the same lung lobe, starting 13 weeks after vaccination. The robust
and localized T cell responses in lung tissue after direct BCG instillation
(Fig. 4a and Extended Data Fig. 12d) provide a potential mechanistic
difference between direct endobronchial and AE delivery that could
influence protection. Finally, a cytomegalovirus (CMV) vector encoding

data are shown as the median of four macaques per group (solid symbols, six
lung lobes from each NHP are averaged) or as counts for each lung lobe (n = 24
lobes) from all NHPs (open symbols with lobes from same macaque matched).
Kruskal-Wallis test was run on medians; Dunn’s adjusted P values are from
comparing each group to the IDlow group. e, Quantification of CD3+, CD20+ and
CD11c+ cells from two lung sections per NHP (matched symbols, n = 2
macaques). f, Representative (one out of four) 1 mm^2 lung sections from each
BCG regimen stained with haematoxylin and eosin (H&E; top) or with
antibodies against CD3+ T cells (red), CD20+ B cells (green), and CD11c+
macrophages or dendritic cells (blue).


Mtb antigens prevented TB disease in 14 out of 34 macaques across two 
studies, with 10 out of 14 being Mtb culture-negative. In contrast to IV
BCG immunization, all CMV-immunized macaques generated primary 
responses to Mtb antigens after challenge, suggesting that these vac- 
cines elicit distinct mechanisms or kinetics of protection. 

There are at least three immune mechanisms for how IV BCG may 
mediate protection. First, rapid elimination of Mtb may be due to the 
high magnitude of T cell responses in lung tissue. Our data are con- 
sistent with studies in mice that demonstrate the superior capacity of 
lung-localized TRM cells to control TB disease, and studies in NHPs
showing that depletion of lung interstitial CD4 T cells during SIV infec- 
tion of Mtb latently infected NHPs is associated with reactivation and 
dissemination. Second, there is some evidence that antibodies can

Fig. 4 | Detection of T cells in lung tissue after IV BCG immunization. a, One
month after BCG vaccination, tissue-derived versus blood-derived cells in lung
were delineated by injecting NHPs with a fluorochrome-conjugated anti-CD45
antibody (ivCD45) to label leukocytes in the vasculature. NHPs (cohort 6, n = 2
macaques) received 5 x 10^7 CFUs BCG ID, IV, AE or endobronchially (EB) into the
left lung. At necropsy, BCG CFUs were quantified in tissues and cells were 
stained immediately ex vivo for surface marker expression (a) or stimulated 
with Mtb whole-cell lysate (WCL) and stained for cytokine production (b,c). 


mediate control against Mtb in vivo or in vitro. Antibody levels were
higher in the BAL and plasma after IV BCG compared with other routes
of vaccination, but declined to pre-vaccination levels in the BAL at the
time of challenge (Extended Data Fig. 7). Third, IV BCG vaccination in
mice induced epigenetically modified macrophages with enhanced
capacity to protect against Mtb infection, a process termed ‘trained
immunity’. Such an effect was dependent on BCG being detectable
in the bone marrow; this was not observed one month after IV BCG
vaccination in NHPs (Fig. 3a). Moreover, there was no increase in innate
activation of PBMCs to non-Mtb antigens after IV BCG vaccination—a 
hallmark of trained immunity (Supplementary Data 3). Nonetheless, it 
is possible that any of these three mechanisms might act independently 
or together to mediate protection. 

Because nine out of ten macaques were protected by IV BCG immu- 
nization (Fig. 2), we were unable to define an immune correlate of pro- 
tection within this group (Extended Data Fig. 13); however, there were 
several unique quantitative and qualitative differences in the immune 
responses after IV BCG vaccination that may underlie protection. First, 
there were substantially higher numbers of Mtb antigen-responsive 
T cells in the BAL and PBMCs (Fig. 1b-e). Second, there was a unique CD4
T cell transcriptional profile in the BAL, which included upregulation of
genes that have been associated with protection against TB (Fig. 1f-h). 
Third, and perhaps most noteworthy, was the large population of T cells 
in the tissue across all lung parenchyma lobes (Fig. 4, Extended Data 




Plots show CD4 T cells from the BAL and lung lobes (RU, right upper; RM, right 
middle; RL, right lower; LU, left upper; LL, left lower) from one of two macaques 
per BCG regimen. a, Percentage of ivCD45− (unstimulated) CD4 T cells
expressing the tissue-resident/activation marker CD69; BCG CFUs (if detected)
are indicated by red bars and right scale. b, Percentage of WCL-responsive
(IFNγ+) CD4 T cells in BAL and lung tissue (ivCD45−) and (c) the percentage of
IFNγ+ CD4 memory T cells expressing CD69 and CD103 after IV BCG
vaccination.


Fig. 12 and Supplementary Data 7). Notably, although the BAL CD4 
T cell responses were higher in IDhigh-, AE- and AE/ID-BCG-immunized
NHPs compared to the IDlow BCG group, there was no increased
protection. These data suggest that although measurement of BAL responses
may provide greater insight into vaccine efficacy compared to blood,
they may not fully reflect lung TRM cell responses that might be the
mechanism of protection. 

In conclusion, this study provides a paradigm shift towards
developing vaccines focused on preventing TB infection to prevent latency,
active disease and transmission. The data support clinical development
of IV delivery of BCG for use in adolescents or adults in whom modelling
predicts the greatest effect on TB transmission, and suggest that the IV
route may improve the protective capacity of other vaccine platforms. 
This study also provides a benchmark against which future vaccines will 
be tested and a new framework to understand the immune correlates 
and mechanisms of protection against TB.

Methods


Macaques and sample size 
Indian-origin rhesus macaques (Macaca mulatta) used in these studies 
are outlined in Extended Data Fig. 1c and Supplementary Table 1. All
experimentation complied with ethical regulations at the respective 
institutions (Animal Care and Use Committees of the Vaccine Research 
Center, NIAID, NIH and of Bioqual, Inc., and of the Institutional Animal 
Care and Use Committee of the University of Pittsburgh). Macaques 
were housed and cared for in accordance with local, state, federal, and 
institute policies in facilities accredited by the American Association for 
Accreditation of Laboratory Animal Care (AAALAC), under standards 
established in the Animal Welfare Act and the Guide for the Care and Use 
of Laboratory Animals. Macaques were monitored for physical health, 
food consumption, body weight, temperature, complete blood counts, 
and serum chemistries. All infections were performed at the University 
of Pittsburgh where animals were housed in a biosafety level 3 facility. 
The sample size for this study was determined using bacterial burden
(measured as log10-transformed total thoracic CFUs) as the primary
outcome variable. Initially, we planned to test BCG route efficacy
by comparing IV, AE and AE/ID routes to IDlow vaccination and found
that ten macaques per group would be sufficient to obtain over 90%
power and adjusted the type I error rate for three group comparisons
(α = 0.0167). After initiation of the first cohort of NHPs in this study,
we elected to test the effect of dose on ID vaccination by adding an
IDhigh group (n = 8 macaques). The additional treatment group did not
substantially reduce the power of the study. To detect a 1.5 difference
in log10(total CFUs) with a pooled standard deviation of 0.8 (using previous
data), we obtained over 90% (90.7%) power using 10 macaques
per group with an adjusted type I error rate for 4 group comparisons
(α = 0.0125). The comparison made between the IDhigh (n = 8 macaques)
and IDlow (n = 10 macaques) groups achieved 85.6% power detecting the
same difference (log10(1.5)) and with an α = 0.0125.
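
The arithmetic behind these figures can be sketched as follows. This is an illustration only, not the authors' calculation: it assumes a family-wise error rate of 0.05 split evenly across comparisons (consistent with the reported 0.0167 and 0.0125), and it replaces the t-based power calculation with a normal approximation, which overestimates power at these sample sizes. The function names are hypothetical.

```python
from statistics import NormalDist


def bonferroni_alpha(family_alpha: float, n_comparisons: int) -> float:
    """Per-comparison type I error rate under an even (Bonferroni) split."""
    return family_alpha / n_comparisons


def approx_power(delta: float, sd: float, n1: int, n2: int, alpha: float) -> float:
    """Two-sided, two-sample power using a normal approximation.

    delta: difference to detect (here, in log10 total CFUs)
    sd:    pooled standard deviation
    """
    z = NormalDist()
    se = sd * (1 / n1 + 1 / n2) ** 0.5   # standard error of the difference
    z_crit = z.inv_cdf(1 - alpha / 2)    # two-sided critical value
    shift = delta / se                   # standardized effect
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


# Assumed family-wise alpha of 0.05 (not stated explicitly in the text).
alpha3 = bonferroni_alpha(0.05, 3)   # three comparisons
alpha4 = bonferroni_alpha(0.05, 4)   # four comparisons

power_10_10 = approx_power(1.5, 0.8, 10, 10, alpha4)
power_8_10 = approx_power(1.5, 0.8, 8, 10, alpha4)
```

Because of the normal approximation, both values come out higher than the reported 90.7% and 85.6%; the sketch reproduces the ordering and the adjusted α values, not the exact power.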


BCG vaccination 

For Mtb challenge studies (cohorts 1-3), 3-5-year-old male (n=32) and 
female (n = 20) rhesus macaques were randomized into experimental 
groups based on gender, weight and pre-vaccination CD4 T cell responses 
to PPD in BAL. Macaques were vaccinated at Bioqual, Inc. under seda- 
tion and in successive cohorts as outlined in Extended Data Fig. 1c. BCG 
Danish Strain 1331 (Statens Serum Institut, Copenhagen, Denmark)
was expanded, frozen at approximately 3 × 10⁸ CFUs ml⁻¹ in single-use
aliquots and stored at -80 °C. Immediately before injection, BCG (for
all vaccine routes) was thawed and diluted in cold PBS containing 0.05%
tyloxapol (Sigma-Aldrich) and 0.002% antifoam Y-30 (Sigma-Aldrich)
to prevent clumping of BCG and foaming during aerosolization. For
ID vaccinations, BCG was injected in the left upper arm (5 × 10⁵ CFUs;
IDlow) or split across both upper arms (5 × 10⁷ CFUs; IDhigh) in a volume of
100-200 µl per site. IV BCG (5 × 10⁷ CFUs) was injected into the left saphenous
vein in a volume of 2 ml; AE BCG (5 × 10⁷ CFUs) was delivered in a 2
ml volume via paediatric mask attached to a Pari eFlow nebulizer (PARI
Pharma GmbH) that delivered 4 µm particles into the lung, as previously
described; AE/ID macaques were immunized simultaneously (5 × 10⁷
CFUs AE plus 5 × 10⁵ CFUs ID in left arm); EB BCG (5 × 10⁷ CFUs in 2 ml;
cohort 6 only) was instilled into the left lung lobes using an endoscope.
No loss of viability was observed for BCG after aerosolization. In pilot 
studies, lower doses of BCG were prepared and delivered as described 
above. Text refers to nominal BCG doses—actual BCG CFUs for vaccine 
regimens in every cohort were quantified immediately after vaccination 
and are reported in Extended Data Fig. 1c and Supplementary Table 1. 


Mtb challenge 

Macaques (cohorts 1-3) were challenged by bronchoscope with 4-36 
CFUs barcoded Mtb Erdman 6-10 months after BCG vaccination 
(Extended Data Fig. 1c and Supplementary Table 1) in a 2 ml volume
as previously described. Infectious doses across this range result in
similar levels of TB disease in unvaccinated rhesus in this and previous
studies (Supplementary Data 12). Clinical monitoring included regular
monitoring of appetite, behaviour and activity, weight, erythrocyte
sedimentation rate, Mtb growth from gastric aspirate and coughing.
These signs, as well as PET-CT characteristics, were used as criteria in
determining whether a macaque met the humane end point before the 
pre-determined study end point. 


PET-CT scans and analysis 

PET-CT scans were performed using a microPET Focus 220 preclinical 
PET scanner (Siemens Molecular Solutions) and a clinical eight-slice helical
CT scanner (NeuroLogica Corporation) as previously described.
2-deoxy-2-(¹⁸F)fluorodeoxyglucose (FDG) was used as the PET probe.
Serial scans were performed before, 4 and 8 weeks after Mtb, and before 
necropsy (cohorts 1-3) or at 2 and 4 weeks after BCG (cohorts 5a, b). 
OsiriX MD (v.10.0.1), a DICOM (Digital Imaging and Communications 
in Medicine) image viewer, was used for scan analyses, as described.
Lung inflammation was measured as total FDG activity within the lungs. 
A region of interest (ROI) was segmented which encompassed all lung
tissue on CT and was then transferred to the co-registered PET scan. On 
the PET scan, all image voxels of FDG-avid pathology (Standard Uptake 
Value >2.3) were isolated and summated resulting in a cumulative stand- 
ardized uptake value. To account for basal metabolic FDG uptake, total 
FDG activity was normalized to resting muscle resulting in a total lung 
inflammation value. Individual granulomas were counted on each CT 
scan. If granulomas were too small and numerous within a specific area 
to count individually or if they consolidated, quantification was consid- 
ered to be too numerous to count. To measure the volume of the spleen, 
an ROI was drawn outlining the entire organ on each of the axial slices 
of the CT scan and the volume was computed across these ROIs (using 
a tool in OsiriX). Any scans for which visibility of the entire spleen was
limited (n = 2 macaques) were excluded from this analysis. 
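
As a rough illustration of the lung inflammation measure described above, the thresholding and summation can be sketched as below. The flat voxel list, the boolean lung mask and the function name are hypothetical, and normalization is assumed here to mean division by a resting-muscle SUV; the text does not specify the exact operation performed in OsiriX.

```python
def total_lung_fdg_activity(suv_voxels, lung_mask, muscle_suv, threshold=2.3):
    """Sum SUVs of FDG-avid voxels inside the lung ROI, then normalize.

    suv_voxels: standardized uptake value of each PET voxel
    lung_mask:  True where the voxel lies inside the CT-derived lung ROI
    muscle_suv: resting-muscle SUV used for normalization (assumed: division)
    threshold:  voxels must exceed this SUV to count as FDG-avid
    """
    cumulative_suv = sum(
        suv
        for suv, in_lung in zip(suv_voxels, lung_mask)
        if in_lung and suv > threshold
    )
    return cumulative_suv / muscle_suv


# Made-up voxel values for illustration only.
example_suv = [0.5, 2.4, 3.0, 5.0, 2.3, 4.0]
example_mask = [True, True, True, False, True, True]
example_activity = total_lung_fdg_activity(example_suv, example_mask, muscle_suv=0.5)
```

In this toy input the voxel at 5.0 is excluded because it lies outside the lung ROI, and the voxel at exactly 2.3 is excluded because the criterion is strictly greater than 2.3.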


Necropsy, pathology scoring and Mtb and BCG burden 
For challenge studies (cohorts 1-3), NHPs were euthanized 11-15 weeks 
after Mtb or at humane endpoint by sodium pentobarbital injection, followed
by gross examination for pathology. A published scoring system
was used to determine total pathology from each lung lobe (number 
and size of lesions), LN (size and extent of necrosis), and extrapul- 
monary compartments (number and size of lesions). All granulomas 
and other lung pathologies, all thoracic LNs, and peripheral LNs were 
matched to the final PET-CT scan and collected for quantification of 
Mtb. Each lesion (including granulomas, consolidations and clusters 
of granulomas) in the lung, all thoracic LNs, random sampling (50%) 
of each of the 7 lung lobes, 3-5 granulomas (if present) or random 
samples (30%) of spleen and liver, and any additional pathologies were 
processed to comprehensively quantify bacterial burdens. Suspensions 
were plated on 7H11 agar (Difco) and incubated at 37 °C with 5% CO₂ for
3 weeks for CFU enumeration or formalin-fixed and paraffin-embedded 
for histological examination. CFUs were counted and summed to calculate
the total thoracic bacterial burden for the macaque. Mtb
CFUs for every challenged macaque are listed in Supplementary Table 1. 
To determine BCG CFUs, BAL, bone marrow aspirates, and blood 
were collected from NHPs before euthanasia. Individual lung lobes 
and thoracic and peripheral LNs, spleen, liver, and the skin site(s) of 
injection (if applicable) were excised. 0.5 ml of blood and bone mar- 
row and 10% of retrieved BAL wash fluid were plated; approximately 
1 g of tissue (or one whole LN or skin biopsy) was processed in water in
gentleMACS M Tubes (Miltenyi Biotec) using a gentleMACS Dissocia- 
tor (Miltenyi Biotec). Samples were plated and counted as above. Data 
are reported as CFUs ml⁻¹ of blood or bone marrow, CFUs per total
BAL collected, CFUs per one LN or skin biopsy, CFUs per lung lobe or 
spleen. CFUs from individual lung lobes and LNs of the same category 
(for example, hilar) were averaged for each NHP. 
Rhesus blood, BAL and tissue processing

Blood PBMCs were isolated using Ficoll-Paque PLUS gradient separa- 
tion (GE Healthcare Biosciences) and standard procedures; BAL wash 
fluid (3 x 20 ml washes of PBS) was centrifuged and cells were combined 
before counting, as described. LNs were mechanically disrupted and
filtered through a 70-µm cell strainer. Lung and spleen tissues were
processed using gentleMACS C Tubes and Dissociator in RPMI 1640 
(ThermoFisher Scientific). Spleen mononuclear cells were further sepa- 
rated using Ficoll-Paque. Lung tissue was digested using collagenase, 
Type I (ThermoFisher Scientific) and DNase (Sigma-Aldrich) for 30-45 min
at 37 °C with shaking, followed by passing through a cell strainer. Single-cell
suspensions were resuspended in warm R10 (RPMI 1640 with 2 mM
L-glutamine, 100 U ml⁻¹ penicillin, 100 µg ml⁻¹ streptomycin, and 10%
heat-inactivated FBS; Atlantic Biologicals) or cryopreserved in FBS 
containing 10% DMSO in liquid nitrogen. 


Multiparameter flow cytometry 

Generally, longitudinal PBMC samples were batch-analysed for 
antigen-specific T cell responses or cellular composition at the end 
of the study from cryopreserved samples whereas BAL and tissue 
(necropsy) samples were analysed fresh. Cryopreserved PBMC were 
washed, thawed and rested overnight in R10 before stimulation, as 
described. For T cell stimulation assays, 1-5 million viable cells were
plated in 96-well V-bottom plates (Corning) in R10 and incubated with
R10 alone (background), or with 20 µg ml⁻¹ tuberculin PPD (Statens
Serum Institut, Copenhagen, Denmark), 20 µg ml⁻¹ H37Rv Mtb WCL
(BEI Resources), or 1 µg ml⁻¹ each of ESAT-6 and CFP-10 peptide pools
(provided by Aeras, Rockville, MD) for 2 h before adding 10 µg ml⁻¹ BD
GolgiPlug (BD Biosciences). The concentrations of PPD and WCL were 
optimized to detect CD4 T cell responses; however, protein antigen 
stimulation may underestimate CD8 T cell responses. For logistical 
reasons, cells were stimulated overnight (14 h total) before intracel- 
lular cytokine staining. For cellular composition determination, cells 
were stained immediately ex vivo after processing or after thawing. 
Antibody and tetramer information for each flow cytometry panel 
is listed in Supplementary Data 8-11. Generally, cells were stained as 
follows (not all steps apply to all panels, all are at room temperature): 
Washed twice with PBS/BSA (0.1%); 20-min incubation with rhesus 
MR1 tetramer (NIH Tetramer Core Facility) in PBS/BSA; washed twice
with PBS; live/dead stain in PBS for 20 min; washed twice with PBS/ 
BSA; 10-min incubation with human FcR blocking reagent (Miltenyi 
Biotec); incubation with surface marker antibody cocktail in PBS/BSA 
containing 1× Brilliant Stain Buffer Plus (BD Biosciences) for 20 min;
washed three times with PBS/BSA (0.1%); 20-min incubation with BD Cytofix/Cytoperm
Solution (BD Biosciences); washed twice with Perm/Wash
Buffer (BD Biosciences); 30 min incubation with intracellular antibody 
cocktail in Perm/Wash Buffer containing 1x Brilliant Stain Buffer Plus; 
washed thrice with Perm/Wash Buffer. For Ki-67 staining, samples were 
stained for surface markers and cytokines as described above, followed 
by nuclear permeabilization using eBioscience Foxp3/Transcription 
Factor Staining Buffer (ThermoFisher Scientific) and incubation with 
antibody against Ki-67 following kit instructions. Data were acquired 
on either a modified BD LSR II or modified BD FACSymphony and analysed
using FlowJo software (v.9.9.6 BD Biosciences). Gating strategies
can be found in Supplementary Data 8-11. All cytokine data presented 
graphically are background-subtracted. 
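
Background subtraction of cytokine frequencies can be illustrated with a minimal sketch. The function name is hypothetical, and clamping negative differences to zero is an assumption made here for illustration; the text states only that background (R10 alone) is subtracted.

```python
def background_subtract(stimulated_pct: float, background_pct: float) -> float:
    """Subtract the R10-alone (background) frequency from the stimulated one.

    Negative results are clamped to zero; this clamping is an assumption
    of the sketch and is not described in the methods.
    """
    return max(stimulated_pct - background_pct, 0.0)


# Made-up percentages of cytokine-positive memory T cells.
ppd_response = background_subtract(1.25, 0.25)
no_response = background_subtract(0.10, 0.30)
```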


Intravascular CD45 staining 

One month after BCG vaccination, macaques in cohort 6 (n = 2
macaques per group) received an IV injection of Alexa Fluor 647-conjugated
anti-CD45 antibody (ivCD45; 60 µg kg⁻¹, clone MB4-6D6,
Miltenyi Biotec) 5 min before euthanasia. Blood was collected before 
anti-CD45 injection as a negative control, and before euthanasia as a 
positive control. NHPs underwent whole body perfusion with cold
saline before tissue collection. Tissues were processed for BCG CFU
quantification and flow cytometric analysis as described above. Staining
panels used were as in Supplementary Data 9, with the omission of
the APC-conjugated antibodies. 


Immunohistochemistry 

Embedded tissue sections were deparaffinized (100% xylenes, 10 min; 
100% ethanol, 5 min; 70% ethanol, 5 min), boiled under pressure for 6 
min in antigen retrieval buffer (1× Tris EDTA, pH 9.0), and cooled. Sections
were blocked in PBS (1% BSA) in a humidified chamber at room
temperature for 30 min followed by staining for CD3 (CD3-12, Abcam),
CD11c (5D11, Leica), and CD20 (Thermo Scientific, RB-9013-PO) for 18 h
at 4 °C in a humidified chamber. After washing with PBS in coplin jars,
sections were incubated for 1 h at room temperature with conjugated
anti-rabbit IgG Alexa Fluor 488 (Life Technologies, A21206), anti-rat IgG 
Alexa Fluor 546 (Invitrogen, A11081), and anti-mouse IgG Alexa Fluor 
647 (Jackson ImmunoResearch, 7 5606-150). After washing, coverslips 
were applied using Prolong Gold anti-fade with Dapi mounting media 
(Life Technologies). Slides were cured for 18-24 h before imaging on an
Olympus FluoView FV1000 confocal microscope. Lung sections were
imaged and two random representative 1 mm² ROIs from each macaque
were analysed using CellProfiler v.2.2.0. Pipelines were designed for
analysis by adding modules for individual channel quantification based 
on pixel intensity and pixel size providing a numerical value for each 
cell type and total cells. Histological analyses were performed by a
veterinary pathologist (E.K.) in a blinded fashion on H&E-stained sections
from all tissues obtained.


ELISpot and Luminex 

IFNγ ELISpots were performed at 0, 4, 6 and 8 weeks after Mtb and at
necropsy. One day before use, hydrophobic high protein binding membrane
96-well plates (Millipore Sigma) were hydrated with 40% ethanol,
washed with sterile water, and coated with anti-human/monkey IFNγ
antibody (15 µg ml⁻¹, MT126L, MabTech) overnight at 4 °C. Plates were
washed with HBSS and blocked with RPMI with 10% human AB serum for
2 h at 37 °C with 5% CO₂. Approximately 200,000 PBMCs per well were
incubated in RPMI supplemented with L-glutamate, HEPES and 10%
human AB serum containing 2 µg ml⁻¹ ESAT-6 or CFP-10 peptide pools
for 40-48 h at 37 °C with 5% CO₂. Medium alone or phorbol 12,13-dibutyrate
(12.5 µg ml⁻¹) plus ionomycin (37.5 µg ml⁻¹) were added as negative
(background) and positive controls, respectively. To develop, plates
were washed with PBS and biotinylated anti-human IFNγ antibody
(2.5 µg ml⁻¹, 7-B6-1, MabTech) was added for 2 h at 37 °C with 5% CO₂. After
washing, streptavidin-horseradish peroxidase (1:100, MabTech) was
added for 45 min at 37 °C with 5% CO₂. Spots were stained using AEC peroxidase
(Vector Laboratories, Inc.) per the manufacturer’s instructions
and counted manually on an ELISpot plate reader. Data are reported as
average ELISpots from duplicate background-subtracted wells. Wells 
with confluent spots were described as too numerous to count. 

To measure innate cytokine production following BCG immuniza- 
tion, cryopreserved PBMC were batch-analysed. Cells were thawed 
and resuspended in warm R10. Then, 5 × 10⁵ cells per well in 96-well
V-bottom plates were rested overnight at 37 °C with 5% CO₂. Cells were
resuspended in Trained Immunity Media plus H37Rv Mtb whole cell
lysate (BEI Resources, 20 µg ml⁻¹), heat-killed Staphylococcus aureus
(InvivoGen, 1 × 10⁶ per ml), Escherichia coli LPS (Sigma-Aldrich, 1 ng ml⁻¹),
or RPMI and incubated for 24 h at 37 °C with 5% CO₂ before collecting
supernatants. Cytokine and chemokine measurements were determined 
using a MILLIPLEX NHP cytokine multiplex kit per instructions (Millipore 
Sigma) and analysed on a Bio-Plex Magpix Multiplex Reader (Bio-Rad).


Antibody ELISAs 

IgG, IgA and IgM titres to Mtb H37Rv WCL were assessed in plasma and 
tenfold concentrated BAL fluid. WCL was used based on greater sensitiv- 
ity compared to PPD, culture filtrate protein, or lipoarabinomannan. 


96-well MaxiSorp ELISA plates (Nunc) were coated overnight at 4 °C 
with 0.1 µg of WCL. Plates were blocked with PBS/FBS (10%) for 2 h at
room temperature and washed with PBS/TWEEN 20 (0.05%). 1:5 serially 
diluted plasma or concentrated BAL fluid (8 dilutions per sample) was 
incubated at 37 °C for 2 h, followed by washing. Then, 100 µl of goat
anti-monkey HRP-conjugated IgG h+l (50 ng ml⁻¹; Bethyl Laboratories,
Inc.), IgA α chain (0.1 µg ml⁻¹, Rockland Immunochemicals Inc.), or IgM
µ chain (0.4 µg ml⁻¹, Sera Care) was added for 2 h at room temperature,
followed by washing. Ultra TMB substrate (100 µl; Invitrogen) was
added for 12 min followed by 100 µl 2 N sulfuric acid. Data were collected
on a Spectramax i3X microplate reader (Molecular Devices) at 450 nm
using Softmax Pro and presented either as endpoint titre (reciprocal
of last dilution with an OD above the limit of detection or 2× the OD of
an empty well) at 0.2 for IgG and IgA, or midpoint titre for IgM where
samples did not titre to a cut off of 0.2.
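
The endpoint titre definition above can be sketched in a few lines. This is a hedged illustration: the starting dilution of 1:5 is an assumption (the text gives only the 1:5 serial step and eight dilutions per sample), the OD readings are made up, and the function names are hypothetical.

```python
def dilution_series(start=5, factor=5, steps=8):
    """Reciprocal dilutions for a serial dilution (assumed to start at 1:5)."""
    return [start * factor ** i for i in range(steps)]


def endpoint_titre(ods, dilutions, cutoff=0.2):
    """Reciprocal of the last dilution whose OD exceeds the cutoff.

    Returns None when no dilution exceeds the cutoff.
    """
    titre = None
    for od, dilution in zip(ods, dilutions):
        if od > cutoff:
            titre = dilution
    return titre


dilutions = dilution_series()
# Made-up OD450 readings for one sample across the eight dilutions.
example_ods = [2.10, 1.80, 1.20, 0.60, 0.25, 0.15, 0.08, 0.05]
example_titre = endpoint_titre(example_ods, dilutions)
```

A sample whose readings never exceed the cutoff returns None, which a caller could report as below the limit of detection.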


Single-cell transcriptional profiling 

High-throughput single-cell mRNA sequencing by Seq-Well was per- 
formed on single-cell suspensions obtained from NHP BAL, as previously
described. Approximately 15,000 viable cells per sample were
applied directly to the surface of a Seq-Well device. At each time point 
after BCG, two arrays were run for each sample—one unstimulated and 
one stimulated overnight with 20 µg ml⁻¹ of PPD in R10.


Sequencing and alignment. Sequencing for all samples was performed
on an Illumina Nova-Seq. Reads were aligned to the M. mulatta
genome using STAR, and the aligned reads were then collapsed by
cell barcode and unique molecular identifier (UMI) sequences using
DropSeq Tools v.1 to generate digital gene expression (DGE) matrices,
as previously described. To account for potential index swapping,
we merged all cell barcodes from the same sequencing run that were
within a Hamming distance of 1.
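
One way to picture the barcode merging step is the sketch below. It is not the DropSeq Tools implementation: it assumes equal-length barcodes, merges transitively with a small union-find, keeps the root with the larger original read count as the representative, and compares all pairs (quadratic time), which is fine for illustration but not for real data. All names and counts are hypothetical.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Number of mismatched positions between two equal-length barcodes."""
    if len(a) != len(b):
        raise ValueError("barcodes must have equal length")
    return sum(x != y for x, y in zip(a, b))


def merge_barcodes(counts: dict) -> dict:
    """Merge barcodes within Hamming distance 1 and sum their counts."""
    parent = {bc: bc for bc in counts}

    def find(bc):
        # Follow parent links to the representative, compressing the path.
        while parent[bc] != bc:
            parent[bc] = parent[parent[bc]]
            bc = parent[bc]
        return bc

    for a, b in combinations(counts, 2):
        if hamming(a, b) <= 1:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                # Keep the root with the larger original count as representative.
                if counts[root_a] >= counts[root_b]:
                    parent[root_b] = root_a
                else:
                    parent[root_a] = root_b

    merged = {}
    for bc, n in counts.items():
        root = find(bc)
        merged[root] = merged.get(root, 0) + n
    return merged


# Made-up barcodes and read counts from a single sequencing run.
example_counts = {"AAAA": 100, "AAAT": 3, "CCCC": 50, "CCGC": 2, "GGGG": 7}
example_merged = merge_barcodes(example_counts)
```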


Analysis of single-cell sequencing data. For each array, we assessed 
the quality of constructed libraries by examining the distribution of 
reads, genes and transcripts per cell. For each time point, we next performed
dimensionality reduction (PCA) and clustering as previously
described. We visualized our results in a two-dimensional space
using UMAP, and annotated each cluster based on the identity of
highly expressed genes. To further characterize substructure within cell
types (for example, T cells), we performed dimensionality reduction
(PCA) and clustering over those cells alone as previously described.
We then visualized our results in two-dimensional space using
t-distributed stochastic neighbour embedding (t-SNE). Clusters
were further annotated (that is, as CD4 and CD8 T cells) by cross-referencing
cluster-defining genes with curated gene lists and online
databases (that is, SaVanT and GSEA/MSigDB).


Module identification. Data from stimulated or unstimulated T cells 
at week 13 or 25 was subset on significant principal components as
previously described and, for those principal components, on genes
with significant loadings as determined through a randomization
approach (‘JackStraw’). These matrices were then used as the inputs
for WGCNA. Following the WGCNA tutorial
(https://horvath.genetics.ucla.edu/html/CoexpressionNetwork/Rpackages/WGCNA/Tutorials/),
we chose an appropriate soft power threshold to calculate the
adjacency matrix. As scRNA-seq data is affected by transcript drop-out
(failed capture events), adjacency matrices with high power further 
inflate the effect of this technical limitation, and yield few correlated 
modules. Therefore, when possible, we chose a power as suggested by 
the authors of WGCNA (that is, the first power with a scale free topol- 
ogy above 0.8); however, if this power yielded few modules (fewer than 
three), we decreased our power. We then generated an adjacency matrix 
using the selected soft power and transformed it into a topological 
overlap matrix (TOM). Subsequently, we hierarchically clustered this
TOM, and used the cutreeDynamic function with method ‘tree’ to identify
modules of correlated genes using a dissimilarity threshold of 0.5
(that is, a correlation of 0.5). To test the significance of the correlations
observed in each module, we implemented a permutation test. Binning 
the genes in the true module by average gene expression (number 
of bins = 10), we randomly picked genes with the same distribution of 
average expression from the total list of genes used for module discovery
10,000 times. For each of these random modules, we performed a
one-sided Mann-Whitney U-test between the distribution of dissimi- 
larity values among the genes in the true module and the distribution 
among the genes in the random module. Correcting the resulting P 
values for multiple hypothesis testing by Benjamini-Hochberg false 
discovery rate correction, we considered the module significant if fewer 
than 500 tests (P < 0.05) had false discovery rate > 0.05.
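
The final decision rule of the permutation test can be sketched as follows. The sketch assumes the 10,000 one-sided Mann-Whitney P values have already been computed (that step is not shown), applies a standard Benjamini-Hochberg adjustment, and then counts how many adjusted values exceed 0.05. Function names are hypothetical, and the authors' R code may differ in detail.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def module_is_significant(pvalues, fdr=0.05, max_failures=500):
    """True if fewer than max_failures tests have adjusted P above fdr."""
    adjusted = benjamini_hochberg(pvalues)
    failures = sum(1 for q in adjusted if q > fdr)
    return failures < max_failures


# Made-up P values standing in for 10,000 random-module comparisons.
passing_module = module_is_significant([0.001] * 9600 + [0.9] * 400)
failing_module = module_is_significant([0.001] * 9400 + [0.9] * 600)
```

With 10,000 random modules, the cutoff of 500 corresponds to requiring that fewer than 5% of the comparisons fail to reach significance after correction.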


Gene module enrichments. To characterize the seven significant 
gene modules identified among in vitro-stimulated T cells collected 13 
weeks after vaccination, we performed an enrichment analysis using 
databases of gene expression signatures (SaVanT and GSEA/MSigDB).
Specifically, the enrichments in the SaVanT database, which includes
signatures from ImmGen, mouse body atlas and other datasets
(http://newpathways.mcdb.ucla.edu/savant-dev/), were performed using
genes included in significant modules with a background expression
set of 32,681 genes detected across single cells using Piano
(https://varemo.github.io/piano/).


Statistical methods 

All reported P values are from two-sided comparisons. For continuous
variables, vaccine routes were compared using a Kruskal-Wallis test 
with Dunn’s multiple comparison adjustment or one-way ANOVA with 
Dunnett’s multiple comparison adjustment (comparing all routes to 
IDlow BCG). Fisher’s exact tests were run for multiple CFU thresholds
(evaluating protection) to assess the association between vaccine route 
and protection from Mtb (Extended Data Fig. 8b). A permutation test
was used to compare fractional distributions (pie charts) of all vaccine
groups to IDlow BCG. For clinical parameters, combined pre-vaccination
measurements from all NHPs were compared against distributions from 
every vaccine group at every time point using Dunnett’s test for multiple 
comparisons. To assess whether post-vaccination antigen-responsive 
CD4 or CD8 T cells in the BAL or PBMCs are associated with disease 
severity, we first calculated peak T cell responses for each NHP over 
the course of vaccine regimen. The log10-transformed CD4 and CD8
cell counts were calculated within BAL and frequencies of CD4 and 
CD8 cells were calculated within PBMCs. To assess the effects of vaccine
route and T cells on log10-transformed total CFUs, several multiple
linear regressions were run in JMP Pro (v.12.1.0). Peak T cell responses
and CFUs for each macaque included in these analyses are provided in 
Supplementary Table 1; detailed regression output (including model 
fit, ANOVA results, effect tests and parameter estimates) is provided 
in Supplementary Table 5. Cytokine production for trained immunity 
assay was compared using a two-way ANOVA and Dunnett’s multiple 
comparison test. Serial PBMC responses to CFP, ESAT-6 or CFP-10 by 
IFNγ ELISpot were analysed by using a Wilcoxon signed-rank test to
compare pre-infection versus 12 weeks post-infection time points 
(within each vaccine route). 
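
To make the protection analysis concrete, a two-sided Fisher's exact test on a single 2 × 2 table (group by protected or not protected at one CFU threshold) can be sketched as below. The counts are invented for illustration and are not results from this study; the analyses reported here were run in statistical software, not with this code.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact P value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables (with the same
    margins) that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Unnormalized hypergeometric weights for each possible top-left cell.
    weights = {x: comb(row1, x) * comb(row2, col1 - x) for x in range(lo, hi + 1)}
    total = sum(weights.values())  # equals comb(row1 + row2, col1)
    extreme = sum(w for w in weights.values() if w <= weights[a])
    return extreme / total


# Hypothetical counts: group A has 9 of 10 animals below a CFU threshold,
# group B has 2 of 10.
example_p = fisher_exact_two_sided(9, 1, 2, 8)
```

Using integer weights avoids floating-point ties when deciding which tables are at least as extreme as the observed one.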


Reporting summary 
Further information on research design is available in the Nature 
Research Reporting Summary linked to this paper. 


Data availability 

All relevant data are available from the corresponding author upon 
reasonable request. Supplementary Table 1 provides peak immune data 
and post-challenge data for individual NHPs and Supplementary Table 5 
provides regression analyses that support Extended Data Fig. 13. Supplementary
Tables 2-4 include stimulation-inducible module genes,
gene enrichments for modules, and differentially expressed genes that 
support transcriptional profiling data. RNA-sequencing data that sup- 
port this study have been deposited in the Gene Expression Omnibus 
(GEO) under accession number GSE139598. Source Data for Figs. 1-4 
and Extended Data Figs. 2-13 are provided with the paper. 


Code availability 
All R code used for analysis of Seq-Well data is available upon request.
Extended Data Fig. 1 | Study design, vaccine regimens, macaques and
cohorts. a, Vaccine groups including route of BCG administration, target dose
of BCG to be delivered (CFUs), and number of NHPs per BCG regimen (n = 10
macaques except IDhigh, n = 8 macaques). Note that IDlow and IDhigh groups
received BCG in one or two sites, respectively, and AE/ID group received AE
(high-dose) and ID (low-dose) BCG simultaneously. Unvaccinated macaques
(n = 4) were used as Mtb challenge controls. b, Timeline for Mtb challenge
cohorts including weeks relative to BCG vaccination for PBMC and BAL sample
collection, Mtb challenge, PET-CT scanning, and scheduled necropsy after
challenge. Macaques that met humane end-point criteria were euthanized
earlier than 12 weeks post-challenge (Supplementary Table 1). c, Data are from
a total of 115 rhesus macaques, 52 of which were challenged with Mtb. Owing to
the ABSL-3 capacity constraints and logistical limits in the number of macaques
that can be sampled, scanned by PET-CT, or necropsied at any given time point,
studies were broken into sequentially immunized and/or challenged cohorts.
A maximum of 20 NHPs were infected with Mtb in any challenge cohort with
infections split over 2 days, staggered by 2 weeks. The actual doses of BCG
administered, determined by subsequent culture, are noted for each vaccine
group. The time interval between vaccination and challenge is noted in weeks
and the challenge dose of Mtb (CFUs) is listed for each challenge cohort (BCG
vaccine dose and Mtb challenge dose for individual NHPs, along with peak
immune responses and detailed outcome data, are provided in Supplementary
Table 1). Protection data are from 8-10 BCG-immunized NHPs per group and 4
unvaccinated controls in cohorts 1-3 (‘Immunology & challenge’). Per protocol,
BAL samples were not collected from animals 8 weeks before, or after, Mtb
challenge. Three NHPs per vaccine group were immunized just as in cohorts
1-3 but were not challenged. Instead, these macaques (cohort 4; ‘Immunology
only’) were sampled (BAL, PBMC) for 6 months after BCG immunization and
then euthanized to perform extensive immune analysis in various tissues
at what would have been the time of challenge. BAL samples from cohort 4
were transcriptionally profiled at weeks 13 and 25. Cohort 5 (a-c) includes
4 macaques per group (except AE and AE/ID groups, n = 2 NHPs each) that
were immunized with BCG and were euthanized 1-3 months later to assess
BCG CFUs and T cell responses in various tissues. NHPs in cohort 6 (‘ivCD45’,
n = 2 macaques per group) received anti-CD45 injection before necropsy to
distinguish blood- and tissue-derived cells. Pilot cohorts (a-c) include NHPs
enrolled in the dose-finding pilot study (n = 3 macaques per dose and route;
‘Immunology pilot’).
Extended Data Fig. 2 | Clinical parameters after BCG vaccination in NHPs. To
assess safety of BCG vaccinations, all macaques (cohorts 1-4 excluding
unvaccinated) were monitored for changes in several clinical parameters at
various time points after BCG. After vaccination, changes were observed
predominantly in IV BCG macaques; however, all were transient. a, Weight and
temperature: there was a 0.9 °C increase in body temperature in the IV BCG
group at day 1, which resolved by day 2; the average pre-vaccination
temperature across all NHPs was 38.4 °C. b, Liver function tests (alanine
aminotransferase (ALT), aspartate aminotransferase (AST), albumin and
globulin): there was a twofold increase in ALT and AST above pre-vaccination
levels (20-30 IU l⁻¹) in the IV BCG group, which resolved by day 28. c, C-reactive
protein (CRP) in the IV BCG group increased up to a median of 400 µg ml⁻¹ at day
2, which resolved by day 14; the average pre-vaccination CRP level in plasma
across all NHPs ranged from 0 to 28 µg ml⁻¹. d, Complete blood counts (CBC).
Transient increases in numbers of circulating neutrophils (day 1) and
lymphocytes, monocytes and basophils (day 7) were observed in the IV BCG
group. All tests were performed longitudinally on whole blood at time of
collection except CRP, which was batch-analysed from frozen plasma samples;
the 6-h time point was measured for CRP only. Data points shown are individual
NHPs (n = 11-13 per group, n = 63 total) with interquartile range (box) and
median (line). For each parameter, pre-vaccination (P) measurements for all
NHPs were combined and compared against distributions from every vaccine
group at every time point using Dunnett’s test for multiple comparisons;
*P < 0.05. No clinical signals, such as lethargy, appetite suppression or weight
loss, were observed up to time of Mtb challenge, 24 weeks later.
Extended Data Fig. 3 | Proportions of leukocyte and T cell subsets in the BAL
and PBMCs after BCG immunization. a-d, We assessed whether the
composition of leukocytes in the BAL or PBMCs was altered after BCG
vaccination. Shown are pie graphs comprising proportions of indicated
leukocytes (a, c) or CD3⁺ T cell subsets (b, d) in BAL (a, b) and PBMCs (c, d) for
each BCG regimen from pre-vaccination up to 24 weeks post-BCG, identified
using multi-parameter flow cytometry as in Supplementary Data 8. a, In the
BAL, the rapid and sustained increase in T cell (but not macrophage) number
(Fig. 1a and Supplementary Data 2b) altered the overall cellular composition of
BAL from approximately 75% alveolar macrophages (red) and 15% T cells (blue)
before vaccination to approximately 65% T cells and 30% macrophages, even
6 months after IV BCG. b, To delineate the composition of BAL T cells further,
the proportions of CD4 and CD8 T cells, as well as non-classical T cells (γδ, MAIT
and iNKT) that may also have a role in protection against TB, were assessed.
Two weeks after vaccination, there was a substantial but transient increase in
the proportion of Vγ9⁺ γδ T cells and MAIT cells after IV BCG, and a trend
towards increased Vγ9⁺ γδ T cells and MAIT cells after BCG IDhigh. However, by 8
weeks, the proportions of these non-classical T cells contracted to pre-vaccination
levels. c, d, A similar analysis was performed to determine how the
route of BCG immunization influenced the composition of leukocytes in
PBMCs. Here, IV BCG induced a transient increase in Vγ9⁺ γδ T cells but not
MAIT cells. BAL pie graphs represent the average proportions from 13 NHPs per
BCG regimen (cohorts 1-4; Extended Data Fig. 1c) except where indicated
(white numbers in a also apply to b). PBMC pie graphs represent the average
proportions from three NHPs per BCG regimen (cohort 4). B, B cells; Mφ,
macrophages; Mono, monocytes; T, T cells; Neut, neutrophils. P values indicate
differences compared to pre-vaccination within the same vaccine group using
a permutation test.
Extended Data Fig. 4 | Extended immune data from challenge and
immunology cohorts. a, b, Full kinetics of PBMC responses from NHPs in
challenge cohorts (cohorts 1-3, n = 8-11 macaques) as in Fig. 1b, c. Shown is the
frequency of memory CD4 (a) or CD8 (b) T cells producing any combination of
IFNγ, IL-2, TNF or IL-17 in response to PPD stimulation at various time points
before and up to 24 weeks after BCG. Grey lines are individual NHP responses;
bold, coloured lines represent the median response. Each group was compared
to IDlow at weeks 4 and 24 (one-way ANOVA; P values are Dunnett’s multiple
comparison test). c, d, T cell responses from a replicate cohort of similarly BCG-immunized
rhesus macaques (cohort 4, n = 3 NHPs) from which BAL was
collected for 24 weeks after BCG vaccination. Shown is the frequency (top) or
absolute number (log10-transformed; bottom) of CD4 (c) or CD8 (d) memory
T cells expressing any combination of IFNγ, IL-2, TNF or IL-17 in response to PPD
stimulation, before and up to 24 weeks after BCG vaccination. Kruskal-Wallis
test was used to compare each group to IDlow at weeks 8 (peak) and 24 (time of
challenge); P values are Dunn’s multiple comparison test. e, f, The memory
phenotype of antigen-responsive CD4 (e) and CD8 (f) T cells in PBMCs and BAL 
at the peak of the response (week 4 for PBMC, week 8-12 for BAL; cohorts 1-3, 
n=8-10 macaques) and time of challenge (week 24 collected for PBMC only) 
was assessed. Cytokine-positive T cells from PBMCs were categorized as 
central memory (T_CM), transitional memory (T_TM), effector memory (T_EM) or terminal effectors (T_TE)
based on expression of CD45RA, CD28 and CCR7 as shown in Supplementary
Data 10. Most responding cells in PBMCs were central memory and transitional
memory T cells, with the proportion of transitional memory cells greater in
ID_high- and IV-BCG-immunized NHPs compared with the ID_low group. In BAL,
where T cells are CCR7-negative, most responding CD4 T cells were

CD45RA CD28* T;y,cells (Supplementary Data 9). For CD8 memory 
phenotypes, pie graphs are shown only for groups that displayed measurable 
frequencies of cytokine+ CD8 T cells. IV-BCG-immunized NHPs had larger
proportions of T,y, cells in PBMCs and BAL, which suggests a more diverse 
composition of memory and effector cells than other routes. P values indicate
differences compared to ID_low using a permutation test (CD4 pie graphs only).
g-j, PBMC T cell responses from a replicate cohort of similarly BCG-immunized
rhesus macaques (cohort 4, n = 3 macaques). Shown is the frequency of CD4
(g) and CD8 (h) memory T cells producing any combination of IFNγ, IL-2, TNF or
IL-17 in response to PPD stimulation before and up to 24 weeks after BCG. i, j, As
an immunological indicator of recent antigen exposure and proliferation due
to BCG persistence in vivo, Ki-67 expression in PBMCs over the course of
immunization was assessed. Shown is the percentage of cytokine-positive
(closed symbols, solid lines) or cytokine-negative (open symbols, dashed lines)
memory CD4 or CD8 T cells expressing Ki-67 as identified in Supplementary
Data 11. In IV-BCG-immunized NHPs, at least 60% of antigen-responsive CD4
T cells in blood were Ki-67+ at 2 and 4 weeks after BCG but were at baseline
6 months later.
Extended Data Fig. 5 | Quality of T cell responses in PBMCs and BAL after BCG
immunization. The composition of the cytokine responses at the single-cell
level, or ‘quality’ of the response, can reveal distinct functional differences that
associate with protection against Mtb and other pathogens. Here, the
quality was defined by the relative proportion of antigen-stimulated cells
producing every combination of IFNγ, IL-2 and TNF, with or without CD154 or
IL-17. CD154 (also known as CD40L) expression in PBMCs was measured as a
sensitive marker for detection of all antigen-stimulated CD4 T cells based on
evidence for CD4-dependent, IFNγ-independent mechanisms of protection
against TB. Shown are peak PPD-responsive memory CD4 and CD8 T cell
responses in PBMCs (a, week 4) or BAL (b, week 12) after BCG vaccination for
challenge cohorts 1-3 (n = 8-11 NHPs); analysis of all time points is shown in
Supplementary Data 4 and 5. a, Bar graphs show the frequency of T cells in
PBMCs expressing CD154 with IFNγ, IL-2 or TNF production, and total IL-17
production (CD4 response, top) or IFNγ, IL-2 or TNF for the CD8 response
(bottom). Individual NHP responses are shown with interquartile range (bar)
and median (horizontal line). Pie graphs represent the proportion of the total
response comprising each cytokine combination, averaged for all NHPs, and
are not shown for groups with low to undetectable responses. The proportion
of the response producing IL-17 (with or without other cytokines) is indicated
with a black arc and the proportion expressing CD154 alone is the black pie
section. b, Bar graphs show the frequency of CD4 or CD8 T cells in BAL
producing IFNγ, IL-2 or TNF, and total IL-17 production. Pie graphs represent the
average proportion of total cytokine production comprising each cytokine
combination; the proportion of the total response producing IL-17 (with or
without other cytokines) is indicated with a black arc. Despite the notable
differences in the magnitude of responses amongst BCG regimens, there were
no differences in the quality of CD4 or CD8 T cell responses in
PBMC or BAL. Of note, approximately 90% of the CD4 T cell responses were
composed of T_H1 cytokines with fewer than 10% also producing IL-17; most
IL-17-producing CD4 T cells co-expressed T_H1 cytokines.
Extended Data Fig. 6 | Identification of gene modules and distribution of
module scores. A total of 162,490 single-cell transcriptomes derived from
unstimulated and PPD-stimulated BAL cells from 15 NHPs (cohort 4, n = 3 per
group) at weeks 13 (peak of BAL response) and 25 (time of challenge) were 
profiled. a, Uniform manifold approximation and projection (UMAP) plots of 
BAL cells at weeks 13 and 25 after BCG immunization, coloured by time point 
(top left), PPD stimulation condition (top right), and cell type (week 13, bottom 
left; week 25, bottom right). b, Gene-gene correlation heat map showing 
significant gene modules (M1-M7; top) identified among week 13 stimulated
BAL T cells with select genes (right) highlighted. c, t-Distributed stochastic
neighbour embedding (t-SNE) plots of stimulated BAL T cells from weeks 13 
(left) and 25 (right), coloured by vaccine group (top), T cell subtypes (middle), 
and module 2-positivity (bottom). d, Histograms of the distribution of module 
2 scores by vaccine group (colour) and macaque. Dashed line (placed at two s.d.
above the mean score in the naive controls) indicates the threshold used to call
cells as positive for the module. The percentage module 2-positive is shown for 
each NHP. 
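
The thresholding rule described in d can be sketched in a few lines of Python. This is only an illustration and not the authors' analysis code: the function names, the choice of the sample (rather than population) standard deviation and the example scores are all assumptions made here.

```python
from statistics import mean, stdev


def module_threshold(control_scores, n_sd=2.0):
    """Return mean + n_sd * s.d. of the control scores.

    Assumption: the sample standard deviation is used; the caption does not
    say which estimator was applied.
    """
    return mean(control_scores) + n_sd * stdev(control_scores)


def percent_positive(scores, threshold):
    """Percentage of cells whose module score exceeds the threshold."""
    if not scores:
        return 0.0
    return 100.0 * sum(s > threshold for s in scores) / len(scores)


# Hypothetical module 2 scores, for illustration only (not data from the study).
naive_control_scores = [0.0, 0.1, 0.2, 0.1, 0.1]
vaccinated_scores = [0.05, 0.4, 0.9, 0.1, 0.6]

threshold = module_threshold(naive_control_scores)
print(f"threshold = {threshold:.3f}")
print(f"module-positive = {percent_positive(vaccinated_scores, threshold):.1f}%")
```

In this sketch the threshold is derived once from the pooled naive-control cells and then applied to each animal separately, which mirrors the per-NHP percentages reported in the panel.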
Extended Data Fig. 7 | Humoral immune response in BAL and plasma after
BCG immunization. Mtb-responsive antibody responses were assessed in BAL
and plasma after BCG immunization. Mtb WCL-specific IgG, IgA and IgM
antibody titres were measured from individual NHPs at various time points
before and after BCG immunization. Shown are end-point titres for IgG and IgA
and mid-point titres for IgM (in which the end point was not reached). a,
Antibody titres in tenfold-concentrated BAL fluid (cohorts 1-4, n = 11-13
macaques except at weeks 2, 20 and 24, cohort 4 only, n = 3 macaques). In
concentrated BAL fluid, antigen-responsive IgG, IgA and IgM were detected
only in IV-BCG-immunized NHPs and returned to pre-vaccination levels by the
time of challenge. b, Antibody titres in plasma (n = 11-13 macaques). In plasma,
both ID_high and IV BCG elicited increased IgG and IgA antibody responses
compared to ID_low BCG. Data are geometric mean and s.d.; dashed line indicates
assay limit of detection. A Kruskal-Wallis test was used to compare all vaccine
groups to ID_low at weeks 4, 16 and 24 (BAL) or weeks 4 and 24 (plasma); P values
are from Dunn’s multiple comparison test (colour-coded to vaccine).
Extended Data Fig. 9 | Post-challenge immune responses to mycobacterial
antigens. a, PBMC response to ESAT-6 or CFP-10 peptides (antigens present in
Mtb but not BCG) as determined by IFNγ ELISpot throughout Mtb infection.
Each line is one NHP over time (n = 8-10 macaques; n = 4 unvaccinated); sterile
animals are represented by a triangle, and non-sterile, protected animals (with
1 < CFUs < 50) are denoted by squares. After infection, most animals in the AE or ID
vaccine groups developed ESAT-6 or CFP-10 ELISpot responses, which reflects a
primary response to Mtb. By contrast, responses in the IV BCG group were
lower than in the ID_low group at every time point after infection for ESAT-6
(4 weeks, P = 0.001; 6 weeks, P = 0.045; 8 weeks, P = 0.025; 12 weeks, P = 0.006)
and CFP-10 (4 weeks, P < 0.0001; 6 weeks, P = 0.035; 8 weeks, P = 0.001; 12
weeks, P = 0.004). A Kruskal-Wallis test was run at each time point, with Dunn’s
adjusted P values reported accounting for comparisons of all groups against
ID_low. b, c, The frequency of memory CD4 (b) and CD8 (c) T cells in PBMCs from
BCG-immunized NHPs (n = 8-10) producing any combination of IFNγ, IL-2, TNF
or IL-17 in response to stimulation with either PPD (antigen present in BCG and
Mtb; top row) or pooled ESAT-6 and CFP-10 peptides (antigens present in Mtb
only; bottom row) was measured at the time of challenge (0), and at 4, 8 and 12
weeks after Mtb challenge. Measurements from four unvaccinated, infected
NHPs are included as controls (Unvax, black). Grey lines represent the
responses of individual animals and bold, coloured lines are the mean
responses for each vaccine group. d, Antibody responses post-challenge. Mtb
WCL-specific IgG, IgA and IgM antibody titres were measured in the plasma of
unvaccinated (n = 4) and vaccinated (n = 8-10) NHPs at the time of challenge (0),
and at 4 and 12 weeks after challenge. In b-d, Wilcoxon signed-rank unadjusted
P values compare cytokine frequencies or antibody titres at week 12 after Mtb
(or necropsy) to the time of challenge (week 0) within each vaccine group.
Extended Data Fig. 10 | Inflammation and gross and histopathological
assessment after BCG vaccination. a, Serial FDG PET-CT scans at 2 and 4
weeks after BCG vaccination showed increased metabolism (surrogate for
inflammation) localized to the lung LNs (green arrows), lung lobes and spleen
(yellow arrow) elicited by the IV route but not by other routes (cohort 5a, b, n = 2
macaques). Warm colours indicate increased FDG retention; scale represents 
standardized uptake values. NHP ID numbers are listed above each scan; ‘H’ 
denotes the heart. b, Spleen volume was calculated from CT scans at 2 and 4 
weeks after BCG vaccination (n=2 macaques). At these time points, animals 
given IV BCG had approximately twofold larger spleens than those given ID
BCG, with AE/ID BCG NHPs also displaying modestly enlarged spleens.
c, Thoracic LNs were measured at necropsy, 4 weeks after BCG vaccination
(n = 2 macaques); LNs from IV BCG NHPs were enlarged compared to those from
ID_low NHPs. A Kruskal-Wallis test was run; Dunn’s adjusted P values are reported
comparing each vaccine group to the ID_low group. d, e, H&E-stained sections of
thoracic LNs from vaccinated NHPs (n = 2 macaques), 4 weeks after BCG
vaccination. d, General structure with respect to cortical and medullary
architecture and appearance was normal in LNs from ID_low, ID_high and AE/ID
vaccinated NHPs. The thoracic LNs from the IV-vaccinated macaques 
demonstrated marked follicular lymphoid hyperplasia, with enlarged, 
prominent, variably sized follicles, often with active, expanded germinal 
centres. Original magnification, ×4. e, Small, non-necrotizing epithelioid
histiocytic aggregates (non-necrotizing granulomas, black arrows) were
abundantly disseminated within thoracic LNs from the IV BCG macaques. In the
AE/ID NHPs, a wide nodal distribution of such lesions was also seen, although
granuloma numbers and density were substantially lower. The ID_high NHPs had
only one observable granuloma in a single thoracic LN and in the ID_low NHPs, no
such structures were evident. Original magnification, ×10.
Extended Data Fig. 11 | Immune response to BCG 6 months after vaccination.
Analysis of tissue T cell responses, lung cell counts, immunohistochemistry, 
splenic volume and PET-CT scans was performed 6 months after BCG 
vaccination. a-c, A separate cohort (cohort 3, n=3 macaques) was vaccinated 
with BCG in parallel to the challenge study with the purpose of assessing 
immune responses in various tissues 6 months after BCG (the time point at 
which macaques would be challenged). a, b, Frequency of memory CD4 (a) and 
CD8 (b) T cells producing any combination of IFNy, IL-2, TNF, or IL-17 in 
response to Mtb WCL stimulation in the PBMC, spleen, bone marrow,
peripheral LN, lung LN, lung tissue and BAL. Six months after IV BCG,
immunized NHPs maintained increased frequencies of antigen-responsive
T cells in spleen, BAL and lung lobes. Individual LN and lung lobe responses
were averaged per macaque. Data points are individual macaques with symbols 
matched across tissues within a vaccine group; horizontal bar indicates the 
mean response. c, Number of cells recovered per gram of lung tissue for each 
NHP; the increased numbers of total cells observed at 1 month post-BCG
(Fig. 3d) were not detected at 6 months post-BCG. Data are shown as the
median of 3 macaques per group (solid symbols, counts from six lung lobes per
animal are averaged) or as counts for individual lung lobes for each animal
(open symbols; lobes from the same animal have matched symbols). A Kruskal-Wallis
test was used, and P values represent Dunn’s multiple comparison test
comparing each vaccine group to the ID_low group. d, Quantification of CD3+,
CD20+ and CD11c+ cells from two lung sections (matched symbols) from 1-2
macaques per group using Cell Profiler. e, Representative 1-mm^2 lung sections
from 1-2 macaques per vaccine group were stained with H&E or with antibodies
against CD3+ T cells (red), CD20+ B cells (green), and CD11c+ macrophages or
dendritic cells (blue). Neither the increase in numbers of T cells and CD11c+ cells
nor the histopathological changes in lung sections from IV-BCG-immunized 
macaques observed at 1 month (Fig. 3e, f) were detected 6 months after BCG 
vaccination. f, Spleen volume was calculated from CT scans of 44 NHPs 
(cohorts 1-3) just before Mtb challenge (6 months after BCG vaccination) and 
was not significantly different among vaccine routes (Kruskal-Wallis test, 
P=0.1643). Dots represent individual animals. g, Axial (top) and coronal 
(bottom) PET-CT scans of two representative macaques (n = 8-10) from each
vaccine group 6 months after BCG, before Mtb infection. Animal ID numbers 
are shown below each set of scans. No detectable lung inflammation (FDG 
uptake) was observed in macaques from any vaccine group. 
Extended Data Fig. 12 | Determination of immune responses and BCG in
tissues 1 month after immunization. NHPs (cohort 6, n = 2) were immunized
with 5 × 10^7 BCG CFUs (ID_high (a), IV (b), AE (c) or EB (d)); BCG CFUs and
antigen-responsive T cells were measured in various tissues 1 month later. Before
euthanasia, a fluorochrome-conjugated anti-CD45 antibody was injected
intravenously (ivCD45) such that circulating (intravascular) leukocytes were
uniformly stained (ivCD45+) while leukocytes in the tissue remained protected
from staining (ivCD45−). To investigate whether antigen-responsive (IFNγ+)
CD4 T cells were located in the ivCD45− lung tissue compartment, cells isolated
from lung lobes were re-stimulated in vitro with Mtb WCL and analysed by
intracellular cytokine staining. FACS plots show memory CD4 T cells in all
tissues collected from one of two NHPs per BCG regimen, organized by type/
location (systemic, peripheral LN, lung LN, BAL and lung lobes). The BAL and
lung responses from the IV BCG NHPs, shown in the bottom row of b, are
reproduced from Fig. 4b. Pre-infusion PBMC indicates PBMCs isolated from 
whole blood collected just before anti-CD45 injection. Bar graphs show the 
number of BCG CFUs in each respective tissue for each animal (colour-coded by 
vaccine), if detected. 
Extended Data Fig. 13 | Relationships between peak T cell responses in BAL
or PBMCs and total CFUs at necropsy. a-d, Linear regressions were used to 
test whether antigen-responsive CD4 (a, b) or CD8 (c,d) T cell numbers (BAL; 
a,c) or frequencies (PBMC; b, d) after BCG immunization are associated with 
disease severity (total CFUs). Results indicate that when controlling for all 
vaccine routes, peak CD4 T cells in the BAL and PBMC, and peak CD8 T cells in
the BAL do not have a significant association with total CFUs (Supplementary
Table 5a-c). Of note, in PBMCs, higher peak CD8 frequencies are associated
with lower total CFUs after controlling for route (Supplementary Table 5d).
Each dot represents an individual animal; coloured lines represent linear fit for 
each vaccine route. Dotted black lines represent linear fit for all vaccine routes 
combined (with 95% confidence interval shaded in grey). 
RIPK1 is a key regulator of innate immune signalling pathways. To ensure an optimal
inflammatory response, RIPK1 is regulated post-translationally by well-characterized
ubiquitylation and phosphorylation events, as well as by caspase-8-mediated
cleavage. The physiological relevance of this cleavage event remains unclear,
although it is thought to inhibit activation of RIPK3 and necroptosis. Here we show
that the heterozygous missense mutations D324N, D324H and D324Y prevent caspase
cleavage of RIPK1 in humans and result in an early-onset periodic fever syndrome and
severe intermittent lymphadenopathy, a condition we term ‘cleavage-resistant
RIPK1-induced autoinflammatory syndrome’. To define the mechanism for this
disease, we generated a cleavage-resistant Ripk1^D325A mutant mouse strain. Whereas
Ripk1^−/− mice died postnatally from systemic inflammation, Ripk1^D325A/D325A mice died
during embryogenesis. Embryonic lethality was completely prevented by the
combined loss of Casp8 and Ripk3, but not by loss of Ripk3 or Mlkl alone. Loss of RIPK1
kinase activity also prevented Ripk1^D325A/D325A embryonic lethality, although the mice
died before weaning from multi-organ inflammation in a RIPK3-dependent manner.
Consistently, Ripk1^D325A/D325A and Ripk1^D325A/+ cells were hypersensitive to
RIPK3-dependent TNF-induced apoptosis and necroptosis. Heterozygous Ripk1^D325A/+ mice
were viable and grossly normal, but were hyper-responsive to inflammatory stimuli
in vivo. Our results demonstrate the importance of caspase-mediated RIPK1 cleavage 
during embryonic development and show that caspase cleavage of RIPK1 not only 
inhibits necroptosis but also maintains inflammatory homeostasis throughout life. 


Members of three families presented with a previously undescribed 
autoinflammatory disorder characterized by fevers and pronounced 
lymphadenopathy beginning in early childhood and continuing 
throughout adulthood (Fig. 1a). From birth or shortly thereafter, all
affected individuals experienced fevers usually occurring approxi- 
mately every 2-4 weeks, lasting 1-7 days, and reaching temperatures 
as high as 40-41 °C. Some individuals reported extreme chills, severe 
headaches, and/or hallucinations that coincided with their fevers. These 
flares were accompanied by intermittent episodes of cervical, axillary, 
inguinal and/or periaortic lymphadenopathy that often caused pain or 
discomfort (Fig. 1b, Table 1). Several individuals experienced splenomegaly
and/or hepatomegaly, which were generally more prominent
early in life, as well as oral ulcers, arthralgia or gastrointestinal symptoms
such as abdominal pain, nausea, diarrhoea, constipation, loss of
appetite or weight loss (Table 1). Patient 7 (P7) exhibited a more chronic 
inflammation with acute exacerbation. Study participants often had 
increased levels of inflammatory markers even during symptom-free 
periods. In contrast to some more severe autoinflammatory disor- 
ders, there were no signs of rash, arthritis, genital ulcers or end-stage 
organ damage and the condition was not life-threatening in any of the 
patients (Table 1). 

Lymphocyte counts were normal between flares in the seven affected 
participants (Extended Data Table 1). However, pro-inflammatory 
cytokines were increased in the serum from P7 when inflamed but not
during a flare (Fig. 1c). Transcriptomic analysis from P7 whole-blood
RNA revealed an enrichment of several inflammatory gene signatures 
(Fig. 1d, Extended Data Fig. 1a, b). Affected members of family 2 had all 
taken prednisone during flares, with varying degrees of acute relief but 
without long-term prevention of future episodes (Table 1). Participants 
P1, P6 and P7 had tonsillitis (Table 1), but tonsillectomy did not improve
symptoms. Similarly, the IL-1 receptor antagonist, anakinra, and the 
TNF antagonist, etanercept, did not suppress inflammation in patients 
P1, P2, P4 or P7 (Table 1). However, treatment with the IL-6 receptor 
antagonist tocilizumab markedly reduced
the severity and frequency of the symptoms of P1, P2, P3, P6 and P7
(Fig. 1e, Table 1, Extended Data Table 2a). Tocilizumab also provided
some initial relief to P4, but P4 reported aggravation of pre-existing 
oral ulcers, and P6 reported eventual onset of hand pain, and both 
participants elected to discontinue treatment (Table 1). 


Identification of pathogenic mutations in RIPK1 


Exome sequencing in P1 and her unaffected parents and all eight mem- 
bers of family 2 revealed that RIPK1 was the only gene in which a variant 
from both families satisfied filtering criteria. A third mutation in RIPK1
was later discovered in family 3. Affected individuals from the three 
families had different heterozygous missense mutations at the same 
crucial aspartate residue required for RIPK1 cleavage by caspase-8 
(Fig. 1f). The D324N and D324Y mutations occurred de novo in families 1 
(Extended Data Fig. 2) and 3, respectively, whereas D324H was inherited 
in an autosomal dominant pattern in family 2. These mutations are not 
reported in variant databases (Extended Data Table 2b), and none of the 
families had rare co-segregating coding or splice mutations in genes 
previously implicated in autoimmune lymphoproliferative syndrome 
(ALPS) or other monogenic autoinflammatory disorders. Mutations in 
the RIPK1 cleavage site were not found in an additional 554 individuals 
with sporadic unexplained fever, lymphadenopathy, ALPS or idiopathic 
Castleman disease that we screened by Sanger or targeted hybrid cap- 
ture sequencing (Extended Data Table 2c). We therefore designated 
this condition as cleavage-resistant RIPK1-induced autoinflammatory 
(CRIA) syndrome. 

The optimal caspase-8 cleavage motif is highly conserved in verte- 
brates (Fig. 1g, Extended Data Table 3). RIPK1 can be cleaved by both 
caspase-6 and caspase-8, yielding products of similar size, although 
the caspase-6 cleavage site has not been defined. Consistent with
these reports, RIPK1 mutants found in the patients, as well as the D324A
mutant that has previously been shown to prevent RIPK1 cleavage by
caspase-8, were resistant to both caspase-6 and caspase-8 cleavage
in vitro, which suggests that the cleavage sites of caspase-6 and cas- 
pase-8 are the same (Fig. 1h). 


Lack of RIPK1 cleavage causes embryonic lethality 


To investigate the molecular mechanism for CRIA syndrome and char- 
acterize the role of RIPK1 cleavage in vivo, we generated RIPK1 cleav- 
age-resistant mice. Rather than choosing one of the disease-associated 
variants, we mutated the aspartate to alanine. Although the heterozygous 
Ripk1^D325A/+ mice were viable and grossly normal, the homozygous
Ripk1^D325A/D325A mice died during mid-embryogenesis, much earlier than the
postnatal lethality of the Ripk1^−/− mice (Fig. 2a, Extended Data Fig. 3a).
Ripk1^D325A/D325A lethality occurred between embryonic day 10.5 (E10.5) and
E11.5, with the embryos showing several sites of mild-to-severe haemorrhage
beginning in the cephalic vascular plexus, in the midbrain and
hindbrain, but ultimately affecting the entire embryo including the pharyngeal
arches and the pericardial space (Fig. 2a). At E11.5, all Ripk1^D325A/D325A
embryos were dead and displayed major haemorrhage in several
locations (Fig. 2a, Extended Data Fig. 3a). E10.5 Ripk1^D325A/D325A embryos
had endocardial cushion hypoplasia, smaller limb buds and a thinner
neural retina (Fig. 2b). These developmental delays might be due to the
Fig. 1 | Heterozygous mutations of the RIPK1 caspase-8 cleavage site cause
autoinflammatory disease. a, Affected individuals (filled symbols) in three 
families carried mutations in RIPK1 D324. Crossed symbol indicates a deceased 
individual. b, Axial (top) and coronal (bottom) planes of abdominal 
computerized tomography scans of participant P1 at age 11, after 2 months on
tocilizumab but before substantial resolution of symptoms, revealing
periaortic lymphadenopathy (arrows), splenomegaly (14 cm craniocaudal
length), and liver at upper limit of normal (16 cm craniocaudal length). c, Serum
cytokine levels of two P7 samples taken within 1 week, both during infliximab 
and before tocilizumab treatment, and four unrelated adolescent controls 
(ctrl). Dots are from technical duplicates for each time point. Graphs show 
mean. d, RNA sequencing of whole-blood RNA from P7 (two time points, as in
c) and two unrelated adolescent unaffected controls, both with technical
duplicates. Heat map shows differentially expressed inflammatory response
genes (GO: 0006954). For gene names, see Supplementary Fig. 1. e, Response
to tocilizumab infusion in P1. Erythrocyte sedimentation rate (ESR), C-reactive 
protein (CRP), haemoglobin and mean corpuscular volume (MCV) were 
measured serially before and after the start of tocilizumab treatment (grey 
shading). Time after the initial evaluation of this subject at age 10 years is 
depicted on the x axis. Horizontal lines indicate high values (ESR and CRP) or
high and low values (haemoglobin and MCV) for the subject age-specific 
laboratory reference ranges for these markers. RBC, red blood cell count. 

f, RIPK1 DNA sequence chromatograms show heterozygous single-base
substitutions. g, WebLogo demonstrating conservation of the caspase-8
cleavage tetrapeptide motif in RIPK1 (human numbering) in 184 vertebrate
species. h, In vitro caspase assays on wild-type (WT) and RIPK1 mutants. 
Western blots are representative of two independent experiments. For gel 
source data, see Supplementary Fig. 2. 


defective vasculature, associated with extensive cell death observed
in the yolk sac of these embryos (Fig. 2c). This phenotype was reminiscent
of several strains of knockout mice with defects in TNF signalling,
including Casp8^−/− mice. The E10.5 lethality of Casp8^−/− mice is
TNF-dependent, and can be prevented by loss of either Ripk3 or Mlkl,
which suggests that the lethality is due to TNF-induced activation of the
necroptotic pathway that is normally inhibited by caspase-8. These findings
led to the idea that cleavage of RIPK1 by caspase-8 inhibits necroptosis
during embryogenesis. However, Ripk1^D325A/D325A Ripk3^−/− mice
were not viable, consistent with a previous report. Nevertheless, loss
of RIPK3 extended survival more than loss of MLKL, which indicates that
RIPK3 has a non-necroptotic role in the early embryonic lethality (Fig. 2d,
Extended Data Fig. 3b). Combined loss of Casp8 and Ripk3 in these mice
prevented the embryonic lethality, which suggests that caspase-8 
does more than inhibiting RIPK1/RIPK3/MLKL-induced necroptosis 


Table 1 | Clinical features of patients with CRIA syndrome 


Family                      Family 1    Family 2    Family 2    Family 2    Family 2    Family 2    Family 3
Mutation                    Asp324Asn   Asp324His   Asp324His   Asp324His   Asp324His   Asp324His   Asp324Tyr
Patient number              P1          P2          P3          P4          P5          P6          P7
Gender                      F           F           M           F           M           F           M
Age at evaluation (years)   10          82          55          54          22          20          13
Age at onset                2 months    Birth       2 weeks     Birth       Birth       Birth       6 months
Recurrent fevers            +           +           +           +           +           +           +
Fever maximum (°C/°F)       40.5/105    41/106      38.9/102    40.5/105    41/106      41/106      40.5/105
Fever frequency             1/2 weeks   1/month     1/3 weeks   1/2 weeks   1/3 weeks   1/2 weeks   1-3/2 weeks
Fever duration              3-7 days    3 days      3-5 days    2-5 days    2-5 days    3-5 days    1 day
Lymphadenopathy             +           +           +           +           +           +           +
Splenomegaly                +           -           -           -           +           +           +
Hepatomegaly                -           -           -           -           +           +           -
Tonsillitis                 +           -           -           -           -           +           +
Abdominal pain              +           -           +           -           -           +           +
Rash                        -           -           -           -           -           -           -
Oral ulcers                 -           -           +           +           +           +           +
Genital ulcers              -           -           -           -           -           -           -
Arthritis                   -           -           -           -           -           -           -
Arthralgia                  -           -           +           -           -           +           +
Autoantibodies              +ANA        +RF         -           -           -           NA          -
Response to:
  Prednisone                +           +           +           +           +           +           +
  Colchicine                NA          -           -           -           -           -           -
  Anti-IL-1R                -           -           NA          -           NA          NA          -
  Anti-TNF                  -           -           NA          -           NA          NA          +
  Anti-IL-6R                +           +           +           +D          NA          +D          +


Family 2 was first evaluated at the NIH in 1999 for unexplained periodic fever, but the data shown here are from their first return visit after identification of their RIPK1 mutation. For fever frequency, ‘1/2 weeks’ means once every 2 weeks.


+, partial or mixed response; ANA, antinuclear antibody; D, discontinued treatment after less than 1 year owing to reported side effects; NA, not applicable; RF, rheumatoid factor. 


at this embryonic stage (Fig. 2d, Extended Data Fig. 3b). Although loss
of Ripk1 ameliorates the ALPS-like disorder observed in Casp8^−/−
Ripk3^−/− mice, lack of RIPK1 cleavage did not notably affect
it (Extended Data Fig. 3c, d), consistent with observations in
Fadd^−/− Ripk3^−/− Ripk1^D325A/D325A mice. Interestingly, inhibition of RIPK1
kinase activity also rescued the embryonic lethality of Ripk1^D325A/D325A mice
(Fig. 2d). However, Ripk1^D138N,D325A/D138N,D325A mice were runty and did not
survive past weaning (Fig. 2d, e). These mice had a multi-organ inflammation
presenting with skin hyperplasia, infiltration of leukocytes in the
liver and the lung, disorganized splenic architecture and scattered cleaved
caspase-3-positive cells in these organs (Extended Data Fig. 3e). Loss of one
allele of Ripk3 or Casp8 prolonged the survival of Ripk1^D138N,D325A/D138N,D325A
mice to 5 weeks of age, and complete loss of Ripk3 rescued the inflammatory
phenotype of Ripk1^D138N,D325A/D138N,D325A mice (Fig. 2d, e).
Fig. 2 | Homozygous mutation of the RIPK1 caspase-8 cleavage site in mice
causes early embryonic lethality. a, E10.5 (top) and E11.5 (bottom) embryos,
representative of four embryos per genotype. FB, forebrain; HB, hindbrain; He,
heart; FL, forelimb; MB, midbrain; PA, pharyngeal arches. Arrows denote sites
of haemorrhage. Scale bars, 900 µm (E10.5) and 1,400 µm (E11.5).
b, Haematoxylin and eosin (H&E)-stained section of E10.5 embryos,
representative of three embryos per genotype. Arrows denote endocardial
cushions (top) and neural retina (middle). Scale bars, 200 µm. c, E10.5 yolk sacs
stained with anti-PECAM1 (cyan) and anti-cleaved caspase-3 (Cl. CASP3;
magenta) antibodies. Images with severely and less severely disrupted
vasculature are shown. Scale bars, 50 µm. Images are representative of four
embryos per genotype. d, Diagram depicting the extent of viability of different
strains of Ripk1^D325A mice. e, Representative pictures of three mice per genotype
numbered in d.
Fig. 3 | Ripk1^D325A/D325A and Ripk1^D325A/+ cells are hypersensitive to
TNF-induced death. a, c, Cell death of MEFs, monitored by time-lapse imaging
of propidium iodide (PI) staining over 16 h. I denotes 5 μM caspase-8
inhibitor; N denotes 10 μM necrostatin; NT denotes untreated; R denotes 1 μM
RIPK3 inhibitor; S denotes 100 nM SMAC mimetic; T denotes 100 ng ml^−1 (a) or
10 ng ml^−1 (c) TNF.


RIPK1 cleavage limits TNF-induced cell death 


To explore the function of RIPK1 cleavage in TNF signalling, we tested
homozygous Ripk1^D325A/D325A mouse embryonic fibroblasts (MEFs) for
their response to TNF-induced cell death. Notably, even though TNF is
not usually cytotoxic, we found that Ripk1^D325A/D325A MEFs were sensitive
to TNF alone, and that TNF induced increased phosphorylation of RIPK1 and
activation of caspase-8 compared with wild-type MEFs
(Fig. 3a, b). Although inhibiting caspases or RIPK3 kinase activity did not
affect cell death induced by TNF, genetic loss of RIPK3 or RIPK1 kinase
activity significantly reduced TNF-induced cell death (Fig. 3a, b). Loss
of RIPK3 not only completely abrogated death, but also blocked RIPK1
phosphorylation and caspase activation (Fig. 3a, b).

Given that the patients carry RIPK1 mutations in only one allele,
we tested the sensitivity of several Ripk1^D325A/+ heterozygous cell types
to TNF. In contrast to homozygote Ripk1^D325A/D325A MEFs, none of the
tested Ripk1^D325A/+ cell types were sensitive to TNF alone (Extended
Data Fig. 4a, b). However, inhibitors that directly activate the
cytotoxic activity of RIPK1 (for example, SMAC mimetic, or TAK1, IKK or
MK2 inhibitors) rapidly sensitized Ripk1^D325A/+ MEFs and mouse
dermal fibroblasts (MDFs) to low-dose TNF (Fig. 3c, Extended Data
Fig. 4a, c). By contrast, only SMAC mimetic and TAK1 inhibitor sensitized
Ripk1^D325A/+ bone-marrow-derived macrophages (BMDMs) to low-dose
TNF (Extended Data Fig. 4b). In Ripk1^D325A/D325A MEFs, TNF-induced cell
death was more pronounced after the addition of IKK or TAK1 inhibitors
or a combination of SMAC mimetic and MK2 inhibitor (Extended Data
Fig. 4c). In addition, homozygote and heterozygote Ripk1^D325A MEFs and
MDFs were slightly more sensitive to apoptosis induced by low-dose
TNF and cycloheximide (Extended Data Fig. 4a, c).

Treatment with TNF plus SMAC mimetic induced strong phosphorylation
of RIPK1 and RIPK3, as well as activation of caspase-8 and
caspase-3, in Ripk1^D325A/+ cells (Extended Data Fig. 4d-f), which was more
pronounced in the Ripk1^D325A homozygote cells (Fig. 3d, Extended Data
Fig. 4f). This increase in cell death induced by TNF plus SMAC mimetic
correlated with increased formation of a RIPK1-caspase-8-containing
complex 2 (Extended Data Fig. 4g).

Notably, given the increase in caspase-8 activation, loss of RIPK3
markedly delayed cell death induced by TNF plus SMAC mimetic or
TAK1, IKK or MK2 inhibitors in both Ripk1^D325A homozygote and
heterozygote fibroblasts (Fig. 3c, Extended Data Fig. 4a, c). In fibroblasts,
loss of RIPK3 correlated with significantly reduced autophosphorylation
of RIPK1 and caspase activation after TNF and SMAC mimetic treatment
(Fig. 3d, Extended Data Fig. 4d, e). However, inhibition of RIPK3
kinase had little effect on the induction of cell death (Extended Data
Fig. 4a-c), which suggests that RIPK3 contributes mostly in a structural
capacity to the activation of caspase-8 in Ripk1^D325A cells.
Graphs are representative of four independent experiments performed with
two biological repeats per genotype. b, d, Western blot of MEFs treated as in
a for 2 h (b), and as in c for 2 h (d). Results are representative of two
independent experiments. p-RIPK1, phosphorylated RIPK1. β-Actin was used as a
loading control. For gel source data, see Supplementary Fig. 2.


We next analysed both Ripk1^D138N,D325A homozygote and heterozygote
cells and, as expected, genetic loss of RIPK1 kinase activity prevented
RIPK1 autophosphorylation (Fig. 3d, Extended Data Fig. 4d). It also
provided some protection from cell death and this effect was mirrored
by treatment with the RIPK1 inhibitor necrostatin (Fig. 3c, Extended
Data Fig. 4a-c). Similar to RIPK3 loss, this correlated with reduced
caspase-8 activation (Fig. 3d, Extended Data Fig. 4d). Together, these
results indicate that in fibroblasts, RIPK3 promotes caspase-8 activation
in a manner that is independent of its kinase activity and mostly
independent of RIPK1 kinase activity (Fig. 3a, b), unless RIPK1 is further
activated by an additional stimulus, such as SMAC mimetic (Fig. 3c, d).

One surprising observation was that the strong activation of
caspase-8 in Ripk1^D325A cells led to RIPK1 cleavage (Fig. 3d, Extended Data
Figs. 4d, f, 5a). In the case of the heterozygote cells, this was almost
certainly due to cleavage of the wild-type protein; however, we also
detected a slightly smaller RIPK1 cleavage product in homozygote cells
(Fig. 3d, Extended Data Figs. 4f, 5a). This was the result of an alternative
cleavage site (D301 in mouse) that is as well-conserved as the canonical
site (Extended Data Fig. 5, Extended Data Table 2b). However, possibly
owing to the unfavourable hydrophobic amino acid in the P1′ position,
the D301 site was cleaved far less efficiently than the D325 site, and only
when the canonical site was mutated (Fig. 3d, Extended Data Figs. 4f, 5).


RIPK1 cleavage limits inflammatory responses 


Patients with CRIA syndrome have recurrent fevers, so to understand
how loss of RIPK1 cleavage might affect the response to inflammatory
stimuli, we tested the responsiveness of the Ripk1^D325A/+ mice to Toll-like
receptor (TLR) ligands. Although there was not a marked difference in
levels of IL-6, the levels of TNF and IL-1β were higher in the Ripk1^D325A/+
sera after injection of a non-lethal dose of either lipopolysaccharide
(LPS) or polyinosinic:polycytidylic acid (poly(I:C)) (Fig. 4a, Extended
Data Fig. 6a). Similarly, PBMCs from P7 produced more TNF and IL-1β
after LPS or poly(I:C) treatment (Fig. 4b, Extended Data Fig. 6b). Despite
these increased levels of cytokines, hypothermia induced by LPS was
not life-threatening (Extended Data Fig. 6c), which was also consistent
with the symptoms of the patients with CRIA syndrome. BMDMs also
produced more TNF after TLR activation (Fig. 4c), which correlated
with the amount of cell death induced (Extended Data Fig. 6d).

To define the contribution of the haematopoietic compartment
to the hyper-inflammatory phenotype, we generated bone marrow
chimaeras. Notably, both wild-type mice transplanted with Ripk1^D325A/+
haematopoietic cells and Ripk1^D325A/+ mice transplanted with wild-type
bone marrow were hyper-responsive to LPS compared with the controls
(Fig. 4d). Although our data suggest that the increased inflammatory
response in mice correlates with increased cell death in Ripk1^D325A/+



Fig. 4 | RIPK1 cleavage limits inflammation in vivo. a, Serum cytokine levels
after 2 h treatment with 2 mg kg^−1 LPS. Data are mean ± s.e.m., n = 3 mice for
TNF and n = 5 mice for IL-6 and IL-1β. b, Cytokine levels in the supernatant of
two unrelated adolescent controls (RIPK1^+/+) and P7 RIPK1^D324Y/+ PBMCs
treated for 3 h with 10 ng ml^−1 LPS. Data are mean of triplicates. c, TNF
levels in the supernatant of BMDMs treated for 24 h with 25 ng ml^−1 LPS or
2.5 μg ml^−1 poly(I:C). Data are mean ± s.e.m., n = 3 for Ripk1^+/+ and
n = 3-4 for Ripk1^D325A/+. d, Serum TNF levels in wild-type mice reconstituted
with Ripk1^D325A/+ haematopoietic cells (left) or Ripk1^D325A/+ mice
reconstituted with wild-type haematopoietic cells (right), treated for 2 h
with 2 mg kg^−1 LPS. Data are mean ± s.e.m., n = 3 and 4 Ripk1^+/+ → Ripk1^+/+,
n = 6 Ripk1^D325A/+ → Ripk1^+/+, n = 3 for Ripk1^+/+ → Ripk1^D325A/+ mice per
genotype. e, Serum cytokine levels after 2 h treatment with 2 mg kg^−1 of
LPS. Data are mean ± s.e.m., n = 4 for Ripk1^D325A/+, n = 5 for the other
genotypes. Results in a, c and e are representative of two independent
experiments. Each dot in a and c-e represents a mouse. All P values
determined by unpaired, two-tailed t-test.


cells, RIPK1 also contributes to the activation of NF-κB and MAPK
signalling pathways. However, loss of RIPK1 cleavage did not
affect TNF-induced NF-κB or MAPK activation in either mouse cells
or patient-derived dermal fibroblasts (Extended Data Fig. 6e-g).
Furthermore, the cytokine increases observed in the Ripk1^D325A/+
sera were dependent on RIPK3 and caspase-8, which suggests that cell
death is the major contributor to cytokine induction in these mice
(Fig. 4e).

RIPK1 has a role in activating NF-κB and MAPK inflammatory pathways,
caspase-8-mediated apoptosis and RIPK3-dependent necroptosis.
Each of these distinct responses can contribute to inflammatory
signalling and it has been difficult to disentangle which pathway causes
inflammation in any given physiological situation. We describe a human
autoinflammatory disorder caused by heterozygous mutations in RIPK1
seemingly constrained to a single, evolutionarily conserved aspartate
residue at the caspase-6/8 cleavage site. Mutation of this key aspartate
prevents caspase-6/8 cleavage of RIPK1, sensitizes cells to TNF-induced
cell death and causes embryonic lethality in homozygous mice. Several
mechanisms inhibit cell death after TNF stimulation, and
our data emphasize how important this is in limiting an inflammatory
response. Pathogens may counter cell-death-mediated inflammation
by expressing caspase-8 inhibitors, and a cellular defensive mechanism
that amplifies the cell death response in the absence of RIPK1
cleavage makes intuitive sense, and may explain why some pathogens
also attempt to cleave RIPK1. Previously, pathogen inhibition of
caspase-8 was thought to unleash the necroptotic pathway; however,
RIPK1 cleavage not only limits necroptosis, as previously assumed,
but can also limit caspase-8-mediated apoptosis. Furthermore, the
kinase activities of RIPK3 and RIPK1 have mainly been thought of as
activators of necroptosis. However, the rescue of the postnatal lethal
phenotype of the Ripk1^D138N,D325A/D138N,D325A mice by loss of Casp8 or
Ripk3 reveals a far more complex interaction between these molecules than
previously anticipated. Our data provide support for the concept of a
hierarchy of preferred responses to TNF signalling: cell survival, then
caspase-8-mediated apoptosis, with necroptosis as a last resort (Extended
Data Fig. 7). Notably, despite the fact that most of our knowledge of RIPK1
function comes from analyses of TNF signalling, and that TNF has a pivotal
role in many inflammatory diseases, patients with CRIA syndrome
responded to the IL-6 inhibitor tocilizumab but did not respond to TNF
inhibitors. It will be interesting therefore to determine what role RIPK1
has in IL-6-mediated inflammation.
Methods


Participant enrolment 

Families were enrolled and evaluated in the Clinical Center at the 
National Institutes of Health under a protocol approved by the Insti- 
tutional Review Board of the National Institute of Diabetes and Digestive 
and Kidney Diseases and the National Institute of Arthritis and Mus- 
culoskeletal and Skin Diseases. Human studies complied with relevant 
ethical regulations and all participants provided written informed 
consent. No statistical methods were used to predetermine sample size. 


Tocilizumab treatment 

P1 was 11 years old at the time of her first intravenous infusion of
tocilizumab at a dose of 8 mg kg^−1. She initially received medication every
3 weeks but later reduced the frequency to every 4 or 5 weeks because
of a busy school schedule. On the less frequent dosing, P1 had more
breakthrough symptoms, mainly tender lymphadenopathy. In 2018,
when the US FDA approved the use of the injectable form in children
with juvenile idiopathic arthritis, P1 was switched to 162 mg by
subcutaneous injection every 2 weeks and did very well on this regimen. P2,
P3, P4 and P6 received regular self-administered tocilizumab by 162 mg
subcutaneous injections starting at every 2 weeks, the standard
dose and route of administration for adults. The dose frequency for
P3 was gradually increased to every 6 days. P7 received an initial infusion
of tocilizumab at 8 mg kg^−1 before being switched to the subcutaneous
injectable form (162 mg every 2 weeks) for convenience. On this
regimen, P7 noted prompt resolution of fevers, abdominal pain
and joint pain, and gradual normalization of laboratory testing,
including CRP, ESR, haemoglobin, haematological indices and serum
iron.


Exome sequencing 

Exome capture (Illumina TruSeq v2 for family 1, Roche SeqCap EZ
Exome+UTR for family 2, and IDT xGen Exome Research Panel for
family 3) and sequencing (Illumina HiSeq 2000, 2500 and NovaSeq
6000) were performed for all available family members at the National
Institutes of Health (NIH) Intramural Sequencing Center (NISC) using
2 × 101-, 2 × 126- and 2 × 151-base-pair (bp) paired-end reads. The data
were analysed as follows: alignment with Novoalign; duplicate marking
with Picard; re-alignment, re-calibration and variant calling with
GATK; and annotation with Annovar. Variants were filtered to select
those that were nonsynonymous or in splice sites within 6 bp of an
exon, had less than 1% mutant allele frequency in variant databases, and
co-segregated with the phenotype. The mutations were validated by
Sanger sequencing in all family members, and to rule out non-paternity,
non-maternity or other sample identity errors, genders and relatedness
were confirmed by examining heterozygous call rates on the X
chromosome, Y chromosome call rates and Mendelian inheritance
error rates in the exome data.
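
The three variant filters described above can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the `Variant` record, its field names, the sample IDs and the placeholder genes `GENE_A` to `GENE_C` are hypothetical, and co-segregation is simplified to "carried by every affected member and by no unaffected member".

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Variant:
    """One annotated variant call (hypothetical record layout)."""
    gene: str
    effect: str                        # "nonsynonymous", "synonymous", "splice", ...
    splice_distance_bp: Optional[int]  # distance to the nearest exon edge, if any
    max_db_allele_freq: float          # highest frequency across variant databases
    carriers: FrozenSet[str]           # IDs of family members carrying the variant


def passes_filters(v, affected, unaffected, max_af=0.01, splice_window=6):
    """Apply the three filters named in the text to a single variant."""
    near_splice = (
        v.effect == "splice"
        and v.splice_distance_bp is not None
        and v.splice_distance_bp <= splice_window
    )
    functional = v.effect == "nonsynonymous" or near_splice
    rare = v.max_db_allele_freq < max_af
    # Simplified co-segregation: all affected carry it, no unaffected does.
    segregates = affected <= v.carriers and not (unaffected & v.carriers)
    return functional and rare and segregates


# Illustrative trio with a de novo variant in the proband (made-up calls).
affected = frozenset({"proband"})
unaffected = frozenset({"mother", "father"})
calls = [
    Variant("RIPK1", "nonsynonymous", None, 0.0, frozenset({"proband"})),
    Variant("GENE_A", "nonsynonymous", None, 0.05, frozenset({"proband"})),
    Variant("GENE_B", "splice", 8, 0.0, frozenset({"proband"})),
    Variant("GENE_C", "nonsynonymous", None, 0.0, frozenset({"proband", "mother"})),
]
kept = [v.gene for v in calls if passes_filters(v, affected, unaffected)]
print(kept)
```

In this toy input only the rare, nonsynonymous, de novo call survives; the other three fail the frequency, splice-distance and segregation filters, respectively.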


In vitro cleavage assays 

Unlabelled in vitro transcription and translation of 1 μg of empty
pCMV6-Entry control vector (Origene), wild-type RIPK1 cDNA cloned
into pCMV6-Entry vector (Origene), and p.D324N, p.D324H, p.D324Y and
p.D324A mutant RIPK1 constructs (GENEART Site-Directed Mutagenesis
System, Invitrogen) was performed in a 50-μl reaction using the
TnT T7 Quick Coupled Transcription/Translation System (Promega).
We incubated 2 μl of this reaction with either 12 U of purified
recombinant caspase-8 (Calbiochem), 12 U of purified recombinant caspase-6
(Calbiochem), or an equal volume of re-suspension buffer, in caspase
reaction buffer from the Caspase-8 Fluorometric Assay Kit (Enzo Life
Sciences) and 10 mM dithiothreitol (DTT) in a 40 μl final volume at
37 °C for 3 h. These reactions were blotted for RIPK1 using an antibody
recognizing the RIPK1 C terminus (610459, BD Transduction
Laboratories).


RNA sequencing 

Total RNA was isolated from whole blood collected in PAXgene Blood
RNA Tubes using the PAXgene Blood RNA Kit (PreAnalytiX) as per the
manufacturer’s instructions. Total RNA was used for cDNA library
preparation using the TruSeq Stranded mRNA Library Preparation kit for
NeoPrep (Illumina). Sequencing was performed on an Illumina HiSeq
3000 System in a 1 × 50-bp single-read mode. Sequenced reads were
mapped against the human reference genome (GRCh38) using hisat
v.2.2.1.0. Reads mapped to haemoglobin genes were removed from
further analysis. Mapped reads were quantified using HTSeq. All the
count data were normalized using TCC and differentially expressed
genes were detected using edgeR. Gene Ontology enrichment analysis
was performed using DAVID.
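
As a rough illustration of the haemoglobin-removal step, the following Python sketch drops haemoglobin genes from a per-gene count table and then rescales the remainder. The gene list is an assumption (the text does not say which genes were excluded), and counts-per-million is used only as a simple stand-in: it is not the TCC normalization used in the study.

```python
# Assumed set of human haemoglobin gene symbols; the study's actual list is not given.
HAEMOGLOBIN_GENES = {
    "HBA1", "HBA2", "HBB", "HBD", "HBG1", "HBG2", "HBE1", "HBZ", "HBM", "HBQ1",
}


def drop_haemoglobin(counts):
    """Return a copy of the gene -> read-count table without haemoglobin genes."""
    return {gene: n for gene, n in counts.items() if gene not in HAEMOGLOBIN_GENES}


def counts_per_million(counts):
    """Scale counts so that they sum to one million (a stand-in for TCC)."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads left after filtering")
    return {gene: n * 1_000_000 / total for gene, n in counts.items()}


# Made-up whole-blood counts dominated by a globin transcript.
raw = {"HBB": 900_000, "RIPK1": 50, "TNF": 50}
filtered = drop_haemoglobin(raw)
cpm = counts_per_million(filtered)
print(filtered, cpm["RIPK1"])
```

The toy numbers show why the filter matters for whole blood: without it, a single globin gene would absorb most of the library and compress every other gene's normalized value.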


Mice 

All mouse studies complied with relevant ethical regulations and
were approved by the Walter and Eliza Hall Institute Animal Ethics
Committee. The Ripk1^D325A and Ripk1^D138N,D325A mice were generated by
the MAGEC laboratory (WEHI) on a C57BL/6J background. To
generate Ripk1^D325A mice, 20 ng μl^−1 of Cas9 mRNA, 10 ng μl^−1 of sgRNA
(ATTTGACCTGCTCGGAGGTA) and 40 ng μl^−1 of the oligo donor
(tgtcttctcattacagAAAGAGTATCCAGATCAAAGCCCAGTGCTGCAGAG
AATGTTT TCACTGCAGCATGCCTGTGTACCAT TACCTCCGAGCAGGTC
AAATTCAGgtaactcacctattcgttcatttgcatactcgctca) (in which
uppercase bases denote exons; lowercase bases denote intron
sequences) were injected into the cytoplasm of fertilized one-cell
stage embryos generated from wild-type C57BL/6J breeders. To
generate Ripk1^D138N,D325A mice, 20 ng μl^−1 of Cas9 mRNA, 10 ng μl^−1
of sgRNA (TGACAAAGGTGTGATACACA) and 40 ng μl^−1 of oligo
donor (GGATAATCGTGGAGGCCATAGAAGGCATGTGCTACTTACAT
GACAAAGGTGTGATACACAAGAACCTGAAGCCTGAGAATATCCTCGTT
GATCGTGACTTTCACAT TAAGgtaatccacaatctg) were injected into the
cytoplasm of fertilized one-cell stage embryos generated from
Ripk1^D325A/D325A Ripk3^−/− Casp8^−/− breeders. Twenty-four hours later,
two-cell stage embryos were transferred into the uteri of pseudo-pregnant
female mice. Viable offspring were genotyped by next-generation
sequencing. Targeted animals were backcrossed twice to wild-type
C57BL/6J to eliminate off-target mutations and to re-integrate Ripk3
and Casp8 genes into Ripk1^D138N,D325A mice. The Ripk3^−/− mice, Casp8^−/−
mice and Mlkl^−/− mice were all previously described. The Ripk3^−/− mice
were backcrossed to C57BL/6J mice for more than ten generations.
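
A quick consistency check on the sequences printed above is to ask whether each sgRNA target appears in its oligo donor and on which strand. The Python sketch below does only that string comparison, using the sequences exactly as printed; it assumes that the spaces inside the donor sequences are extraction artefacts and strips them, and the names `GUIDE_1`, `DONOR_1`, `GUIDE_2` and `DONOR_2` are labels introduced here for illustration.

```python
def reverse_complement(seq):
    """Reverse complement of an A/C/G/T sequence (case-insensitive)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def locate_guide(guide, donor):
    """Report whether the guide matches the donor on the forward or reverse strand."""
    clean = donor.replace(" ", "").upper()  # drop spaces, ignore exon/intron case
    target = guide.upper()
    if target in clean:
        return "forward"
    if reverse_complement(target) in clean:
        return "reverse"
    return None


# Sequences copied from the text; the two pairs are listed in order of appearance.
GUIDE_1 = "ATTTGACCTGCTCGGAGGTA"
DONOR_1 = (
    "tgtcttctcattacagAAAGAGTATCCAGATCAAAGCCCAGTGCTGCAGAG"
    "AATGTTT TCACTGCAGCATGCCTGTGTACCAT TACCTCCGAGCAGGTC"
    "AAATTCAGgtaactcacctattcgttcatttgcatactcgctca"
)
GUIDE_2 = "TGACAAAGGTGTGATACACA"
DONOR_2 = (
    "GGATAATCGTGGAGGCCATAGAAGGCATGTGCTACTTACAT"
    "GACAAAGGTGTGATACACAAGAACCTGAAGCCTGAGAATATCCTCGTT"
    "GATCGTGACTTTCACAT TAAGgtaatccacaatctg"
)

print(locate_guide(GUIDE_1, DONOR_1), locate_guide(GUIDE_2, DONOR_2))
```

With the sequences as printed, the first guide matches the reverse strand of its donor and the second matches the forward strand. The check says nothing about editing efficiency or about where the introduced codon changes lie.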


TLR challenge 

Eight-to-twelve-week-old male mice received intraperitoneal injection
of either 2 mg kg^−1 LPS or 50 μg poly(I:C). Calculations to determine
group sizes were not performed, mice were not randomized but were
grouped according to genotype, and experiments were blinded.


Cells 

MEFs were isolated from E10.5 embryos and MDFs were isolated from 
mouse tails. After SV40 transformation, MEFs and MDFs were tested 
for mycoplasma. 293T cells (ATCC) used to produce SV40 viruses and 
in Extended Data Fig. 5b were tested for mycoplasma but not authen- 
ticated. 


Time-lapse imaging 

Percentage cell death was assayed every 30-45 min by time-lapse imaging
using the IncuCyte live cell analysis imaging (Essen Bioscience) or
the Opera Phenix High Content Screening System (PerkinElmer) for
16 h with 5% CO2 and 37 °C climate control. For the IncuCyte and Opera
Phenix imaging, dead cells were identified by propidium iodide
(0.25 μg ml^−1) staining, and for the Opera Phenix imaging, all cells were
stained with 250 nM of SiR-DNA (Spirochrome). Dyes were added to the cells
2 h before imaging and compounds were added 10 min before the start
of imaging. For the Opera Phenix imaging, images were analysed using
the server-based Columbus 2.8.0 software (PerkinElmer) to identify
nuclei based on SiR-DNA staining and dead cells using propidium iodide
staining. Results were exported as counts per well to be processed and
graphed using R Studio (https://www.R-project.org/) with the tidyverse
package (https://CRAN.R-project.org/package=tidyverse).
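
The per-well counts described above translate into a percentage in the obvious way: propidium-iodide-positive objects divided by all SiR-DNA-stained nuclei. The study processed its counts in R; the Python sketch below is only an illustration of that arithmetic, with hypothetical function names and made-up counts.

```python
def percent_dead(pi_positive, total_nuclei):
    """Percentage of nuclei in a well that are propidium-iodide positive."""
    if total_nuclei <= 0:
        raise ValueError("well contains no detected nuclei")
    return 100.0 * pi_positive / total_nuclei


def death_curve(timepoints):
    """Turn (time_h, pi_positive, total_nuclei) records into a sorted time course."""
    return [(t, percent_dead(dead, total)) for t, dead, total in sorted(timepoints)]


# Made-up counts for one well, deliberately out of time order.
well = [(0.5, 10, 200), (0.0, 2, 200)]
curve = death_curve(well)
print(curve)
```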


Human and mouse cytokines measurement 

Human serum and PBMC supernatant cytokine content was measured
by enzyme-linked immunosorbent assay (ELISA) (R&D: SLB50, STA00C
and S6050) according to the manufacturer’s instructions. The
measurements were performed in technical duplicates. Student’s t-test was
performed for the statistical analysis. Mouse serum and BMDM
supernatant cytokine content was measured by ELISA (eBioscience for TNF
and IL-6 and R&D for IL-1β) according to the manufacturer’s instructions.


Human PBMC ex vivo stimulation 

Ficoll-isolated human PBMCs were serum-starved for 20 min and stimulated
for 3 h with LPS (Invivogen, tlrl-3pelps) or 6 h with poly(I:C)
(Invivogen, tlrl-pic). Cytokines were measured by ELISA as described above.


Reagents 

The SMAC mimetic compound A, the caspase inhibitor IDN-6556 (Idun 
Pharmaceuticals) and the RIPK1 inhibitor necrostatin were synthesized 
by TetraLogic Pharmaceuticals. The RIPK3 inhibitor GSK’872 was from 
Calbiochem. The TAK1 inhibitor (5Z)-7-oxozeaenol, the IKK inhibitor 
IKK-16 and the MK2 inhibitor PF-3644022 were from Tocris Bioscience. 
Cycloheximide was from Sigma. Recombinant Fc-TNF was produced in 
house. Ultrapure LPS-EB and poly(I:C) were purchased from Invivogen. 


Immunostaining 

Embryonic yolk sacs were fixed for 20 min at room temperature in 4%
paraformaldehyde, blocked and permeabilized in PBS with 2% normal
donkey serum (Jackson ImmunoResearch, 017-000-121) and 0.6% Triton
X, probed with primary antibodies against cleaved caspase-3 (9661, CST) and
PECAM1 (AF3628, R&D Systems) at 4 °C overnight, then with secondary
antibodies goat anti-rabbit AF488 (Invitrogen A-11008) and donkey
anti-goat Cy3 (705-165-147, Jackson ImmunoResearch) at room
temperature for 1 h. Samples were cleared in a glycerol gradient (5-80%)
overnight, whole-mounted in 80% glycerol and imaged using a DP72
microscope and cellSens Standard software (Olympus).


Immunoprecipitation 

Ten million cells were seeded in 10-cm dishes. After the indicated
treatments, cells were lysed in DISC lysis buffer (150 mM sodium chloride,
2 mM EDTA, 1% Triton X-100, 10% glycerol, 20 mM Tris, pH 7.5). Proteins
were immunoprecipitated with 20 μl of protein G Sepharose plus 1.5 μg
of FADD antibody (clone 7A2, in house) with rotation overnight at 4 °C.
Beads were washed four times in DISC and samples eluted by boiling
in 60 μl 1x SDS loading dye.


Western blotting 

Cell lysates were separated on 4-12% gradient SDS-polyacrylamide
gels (Biorad), transferred to polyvinylidene fluoride (Millipore)
membranes and blotted with indicated antibodies purchased from CST
except for phospho-RIPK3 (a gift from Genentech), actin (Sigma) and
FADD (clone 7A2, in house). In vitro cleavage assays were blotted with
an anti-RIPK1 antibody recognizing the C-terminal part (BD
Transduction Laboratories, 610459). Cell lysates were blotted with
an anti-RIPK1 antibody recognizing the N-terminal part (3493, Cell
Signaling Technology).


NF-κB assay in patient-derived cells
NF-κB activation was assessed by measuring nuclear translocation
of subunit p65 in fibroblasts derived from a single skin biopsy. Cells
were grown overnight in 96-well plates seeded at 16,000 cells per well,
and treated for 30 min with TNF (PeproTech) in PBS containing 1
mM CaCl2 and 1 mM MgCl2 (PBS-CM). Cells were pre-fixed for 5 min
with 2% paraformaldehyde (PFA) in PBS-CM, then fixed for 10 min
with 6% PFA in PBS-CM, and aldehyde groups were quenched with
50 mM NH4Cl in PBS-CM for 15 min. After permeabilization with 0.3%
SDS in PBS-CM for 5 min, cells were incubated with donkey serum
dilution buffer (DSDB; 16% donkey serum, 0.3% Triton X-100, and
0.3 M NaCl in PBS) for 30 min, followed by overnight incubation at
4 °C with rabbit monoclonal NF-κB subunit p65 antibody (8242, Cell
Signaling Technology) diluted at 1:500 in DSDB. Samples were then
washed 3 times with permeabilization buffer (0.3% Triton X-100 and
0.1% BSA in PBS) and incubated with a 1:300 dilution of donkey
anti-rabbit secondary antibody coupled to Alexa 488 (A21206, Molecular
Probes) in DSDB for 1 h. Nuclei were counter-stained with a 1:2,000
dilution of SYTO 59 (Thermo Fisher) for 15 min. Automated field
selection and plate imaging were performed with an IncuCyte Zoom
incubator-microscopy system (Essen Bioscience) using a 20x objective.
Nine fields per well of four wells per participant were pooled
for analysis of nuclear p65 signal intensity. Nuclei were marked in
red over a phase-contrast image, and p65 immunofluorescence was
labelled in green. Overlaying a p65 mask on a nuclear mask showed
both positive and negative nuclei, whereas a yellow co-staining mask
showed positive nuclei only.
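
The pooling step can be sketched in a few lines of Python. This is an illustration only: it assumes that each field yields a list of per-nucleus mean p65 intensities and that one summary value per participant is the median of the pooled measurements (the summary reported in Extended Data Fig. 6g); the function name and the numbers are made up.

```python
from statistics import median


def pooled_median(fields):
    """Median nuclear p65 intensity after pooling all fields of one participant."""
    pooled = [value for field in fields for value in field]
    if not pooled:
        raise ValueError("no nuclei measured for this participant")
    return median(pooled)


# Made-up intensities from three fields of one participant.
fields = [[1.0, 3.0], [2.0], [10.0, 4.0]]
summary = pooled_median(fields)
print(summary)
```

Pooling before taking the median weights every nucleus equally, so a field with many cells contributes more than a sparse one.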


Reporting summary 
Further information on research design is available in the Nature 
Research Reporting Summary linked to this paper. 


Data availability 


The original RNA sequencing data are uploaded and available at the 
Gene Expression Omnibus (GEO) under accession GSE127572. All other 
data are available from the corresponding authors upon reasonable 
request. 
Extended Data Fig. 1 | Inflammatory gene signature in P7 whole-blood RNA.
a, MA plot between two P7 samples and two unrelated adolescent healthy
controls, both sequenced with technical duplicates. Differentially expressed
genes (false discovery rate < 0.05) were detected with the TCC-edgeR package
of R, with 903 genes upregulated in P7 and 491 genes downregulated in P7.
b, Representative Gene Ontology terms associated with immune signalling.
Extended Data Fig. 2 | Exome reads in family 1. Excerpts of coverage
histograms and aligned exome sequence reads for the proband and her parents
in family 1, displayed using the integrative genomics viewer, demonstrate
de novo occurrence of the c.970G>A (p.D324N) missense mutation in the LXXD
caspase-6/8 cleavage motif preceding the cleavage site (arrow). Paternity and
maternity were confirmed using Mendelian inheritance error rates from the
same exome data.
Extended Data Fig. 3 | ‘Kinase-dead’ RIPK1 or combined loss of Ripk3 and
Casp8 rescue Ripk1^D325A/D325A lethality. a, b, Observed numbers of offspring
from Ripk1^D325A/+ intercrosses and numbers expected from Mendelian ratios at
the indicated stage of development. Ripk1^D325A/D325A mice are E10.5. All
observed E11.5 Ripk1^D325A/D325A embryos were dead and most of the E10.5
Ripk1^D325A/D325A embryos were abnormal, as described in Fig. 2a, b. Loss of
Ripk3 rescued to E12.5; however, 50% of the embryos were abnormal. None of the
Ripk1^D325A/D325A Ripk3^−/− mice were born. All observed E11.5
Ripk1^D325A/D325A Mlkl^−/− embryos were dead, showing that loss of Mlkl did not
provide any protection. All Ripk1^D325A/D325A Ripk3^−/− Casp8^−/− mice were
born and developed ALPS owing to loss of Casp8. c, Kaplan-Meier survival curves
of the indicated genotypes. d, Cervical lymph nodes (LN), spleen and thymus of
17-week-old mice of the indicated genotypes. Pictures are representative of
five mice per genotype. e, Tissue sections of 18-day-old Ripk1^D138N,D325A/+,
Ripk1^D138N,D325A/D138N,D325A and control mice stained with H&E (left) and
anti-CC3 (brown; right). Pictures are representative of two mice per genotype.
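
For a heterozygote intercross, the numbers "expected from Mendelian ratios" follow the 1:2:1 rule. The short Python sketch below shows that arithmetic; the function name and the litter size are illustrative and are not taken from the study's tables.

```python
from fractions import Fraction

# Expected genotype fractions among offspring of a D325A/+ x D325A/+ cross.
MENDELIAN_FRACTIONS = {
    "+/+": Fraction(1, 4),
    "D325A/+": Fraction(1, 2),
    "D325A/D325A": Fraction(1, 4),
}


def expected_counts(total_offspring):
    """Expected number of offspring per genotype for a given total."""
    return {g: float(f * total_offspring) for g, f in MENDELIAN_FRACTIONS.items()}


expected = expected_counts(40)  # made-up total
print(expected)
```

Comparing such expected counts with the observed counts at each stage is what reveals the loss of homozygous embryos.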



Extended Data Fig. 4 | Ripk1^D325A/+ cells are hypersensitive to TNF-induced
death. a-c, MDFs (a), BMDMs (b) and MEFs (c) of the indicated genotypes were
treated with either a high dose of TNF (T100; 100 ng ml^−1) or a low dose of
TNF (T; 10 ng ml^−1) combined with SMAC mimetic (S; 100 nM), caspase inhibitor
(I; 5 μM), RIPK3 inhibitor (R; 1 μM), necrostatin (N; 10 μM), TAK1 inhibitor
(TAKi; 100 nM), IKK inhibitor (IKKi; 100 nM), MK2 inhibitor (MK2i; 2 μM) or
cycloheximide (1 μg ml^−1) for 16 h. Cell death was quantified by propidium
iodide uptake and time-lapse imaging every 30-45 min using IncuCyte.
Duplicates are shown for each genotype. Graphs are representative of three
(MEFs and MDFs) and two (BMDMs) biologically independent cell lines per
genotype repeated independently. d, MEFs were treated as in Fig. 3d for 2 h.
e, f, MDFs (e) and MEFs (f) were treated as in Fig. 3d for the indicated times.
Results in d-f are representative of two independent experiments. β-Actin was
used as a loading control. g, BMDMs were treated with TNF (100 ng ml^−1)
combined with SMAC mimetic (500 nM) with or without caspase inhibitor
(5 μM) for 90 min, and lysates were immunoprecipitated with anti-FADD.
Results are representative of two independent experiments. For gel source
data, see Supplementary Fig. 2.
Extended Data Fig. 5 | Alternative cleavage of RIPK1. a, MEFs were treated
with 10 ng ml^−1 TNF combined with 500 nM SMAC mimetic for 2 h.
b, Doxycycline-inducible caspase-8-gyrase, wild-type and mutant mouse
RIPK1 constructs or GFP were co-expressed in 293T cells. Cells were treated for
2 h with 1 μg ml^−1 doxycycline to induce caspase-8-gyrase expression and then
for 2 h with 700 nM coumermycin to dimerize caspase-8-gyrase. Antibody
recognizing the N-terminal end of RIPK1 was used. Results are representative
of four (a) and two (b) independent experiments. For gel source data, see
Supplementary Fig. 2.
Extended Data Fig. 6 | RIPK1 cleavage limits inflammation in an
NF-κB-independent manner. a, Serum cytokine levels in wild-type and
Ripk1^D325A/+ mice treated for 3 h with 50 μg of poly(I:C). Each dot
represents a mouse. Data are mean ± s.e.m., n = 3 mice. b, TNF levels in the
supernatant (S/N) of two unrelated adolescent controls (Ctl RIPK1^+/+) and P7
RIPK1^D324Y/+ PBMCs treated for 3 h with 5 μg ml^−1 poly(I:C). Data are mean of
triplicates. c, Body temperature of mice of the indicated genotypes after
injection of 2 mg kg^−1 LPS. Each line represents a mouse; n = 5 mice per
genotype. d, BMDMs of the indicated genotypes were treated for 24 h with
25 ng ml^−1 LPS or with 2.5 μg ml^−1 poly(I:C). Cell death was quantified by
propidium iodide staining and flow cytometry. Each dot represents a biological
repeat. Graph shows mean; n = 1 for Ripk1^+/+ and n = 2 for Ripk1^D325A/+.
e, f, BMDMs (e) and MDFs (f) were treated with 100 ng ml^−1 of TNF for the
indicated times. Results are representative of two independent
experiments. β-Actin was used as a loading control. For gel source data, see
Supplementary Fig. 2. g, NF-κB activation in fibroblasts derived from patient
skin biopsies was assessed by measuring nuclear translocation of subunit p65.
Each dot represents the median of more than 1,000 single-cell measurements
of nuclear mean p65 fluorescent intensities for one individual subject. Data
are mean ± s.d., n = 4 patients and 4 controls. P values determined by
unpaired one-tailed (a) or unpaired two-tailed (g) t-tests.
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Extended Data Fig. 7| See next page for caption. 
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Apoptosis 

- Stabilisation of complex 2
- cFLIP & RIPK1 iPTMs intact
- ↑RIPK1 levels recruit and activate caspase-8
- RIPK3 potentially cleaved to inhibit necroptosis


Mixed apoptosis and necroptosis

- Stabilisation of complex 2
- cFLIP & RIPK1 iPTMs intact
- ↑RIPK1 recruits and activates caspase-8
- RIPK3 cleavage is not significant

Increased cell death 

Increased levels of RIPK1 recruit more caspase-8, however this appears to be 
RIPK3 dependent (kinase independent). RIPK3 is potentially cleaved, however 
this does not appear to affect caspase-8 activation. RIPK1 and RIPK3 become 
auto-phosphorylated, however neither event appears to be essential to increase 
caspase-8 activity. 


Proposed model for Ripk1D325A/+ cells


Increased sensitivity to TNF induced cell death 

As in the homozygote case but caspase-8 cleaves wt RIPK1. The increased 
sensitivity to TNF induced death may cause the hyper-inflammatory response 
observed in the CRIA patients. 


Extended Data Fig. 7| Proposed model for RIPK1(D325A)-induced cell death. 


Left, TNF binding to TNFR1 triggers the formation of complex I, and
subsequent ubiquitylation and phosphorylation of RIPK1. These
post-translational modifications (PTMs) inhibit the cytotoxic activity of RIPK1.
Complex I formation activates NF-κB- and MAPK-dependent survival genes
such as CFLAR, which encodes cFLIP. Subsequently, a cytosolic complex II
containing FADD, caspase-8, RIPK1 and cFLIP is formed. In this complex, cFLIP
inhibits caspase-8 activity so that a restricted number of substrates (such as
RIPK1) are cleaved, but others (such as pro-caspase-3) are not. Cleavage of
RIPK1 dismantles complex II. Activation of the NF-κB and MAPK signalling
pathways and PTMs of RIPK1 prevent TNF from inducing cell death, resulting in
cell survival (top left). Inhibition of the NF-κB or MAPK signalling pathways reduces
levels of cFLIP and accelerates formation of complex II, resulting in cell death
via apoptosis (middle left). When NF-κB or MAPK signalling is disrupted in
caspase-8-deficient conditions, RIPK1 is not cleaved and autophosphorylates,
which triggers the recruitment of RIPK3 and its autophosphorylation. RIPK3
phosphorylates MLKL and necroptosis occurs (bottom left). Right, according
to this model, lack of RIPK1 cleavage could result in several distinct outcomes,
as follows. (1) RIPK1 accumulation could stabilize complex II, and the presence
of cFLIP and inhibitory PTMs to RIPK1 may prevent caspase-8 from killing,
resulting in cell survival. (2) The accumulation of ‘uncleavable’ RIPK1 in
complex II could override the inhibitory RIPK1 PTMs, resulting in
autophosphorylation of RIPK1 and recruitment of RIPK3, leading to
necroptosis. (3) RIPK1 accumulation could result in activated caspase-8 that
cleaves RIPK3, resulting in cell survival. (4) Stabilization of complex II could
result in recruitment and activation of caspase-8 that induces apoptosis and
possibly prevents necroptosis by cleaving RIPK3. (5) Finally, the accumulation
of RIPK1 could result in activation of both RIPK3 and caspase-8 and therefore
induce both apoptotic and necroptotic cell death. In terms of how these
potential outcomes match with our data, in homozygote Ripk1D325A/D325A cells, both
caspase-8 and RIPK3 are activated after TNF signalling, which suggests that
apoptosis and necroptosis occur at the same time (Figs. 2d, 3a, b). However,
according to these models, loss of RIPK3 limits caspase-8 activation (Fig. 3a, b).
This suggests that the recruitment of RIPK3 to complex II increases the
recruitment and activation of caspase-8. A precedent for this observation
comes from experiments in which RIPK3 inhibitors promoted RIPK1-
dependent caspase-8 activation, in a manner we term ‘reverse activation’. In
our experiments, however, RIPK3 activation occurs downstream of TNF
signalling, which suggests that reverse activation might represent a
physiological amplification loop that increases caspase-8 activation. Yet, this
requirement for RIPK3 is not present in all cells, as the embryonic lethality of
the RIPK1-cleavage mutant is only partially rescued by loss of Ripk3. In the
heterozygote Ripk1D325A/+ cells, caspase-8 cleaves wild-type RIPK1, thus limiting
TNF-induced cell death as compared to homozygote cells. However, reduction
of cFLIP and/or RIPK1 PTMs by treatment with IAP, TAK1, IKK or translational
inhibitors decreases the threshold of TNF sensitivity (Extended Data Fig. 4).
This may cause the hyper-inflammatory response observed in patients with
CRIA syndrome (Fig. 1).
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Extended Data Table 1| Leukocyte surface markers in patients with CRIA syndrome 


Affected subjects 


Age at evaluation 


Percentage of leukocytes range 


Percentage of lymphocytes | Surface markers | Reference range
Total T | CD3+ | [60.0-83.7]
Total helper T | CD3+CD4+ | [31.9-62.2]
Helper T, naive | CD3+CD4+CD62L+CD45RA+ | [7.6-37.7]
Helper T, central memory | CD3+CD4+CD62L+CD45RA- | [10.4-30.7]
Helper T, effector memory | CD3+CD4+CD62L-CD45RA- | [2.3-15.6]
Helper T, TEMRA | CD3+CD4+CD62L-CD45RA+ | [0.0-1.5]
Total cytotoxic T | CD3+CD8+ | [11.2-34.8]
Cytotoxic T, naive | CD3+CD8+CD62L+CD45RA+ | [5.7-19.7]
Cytotoxic T, central memory | CD3+CD8+CD62L+CD45RA- | [1.5-10.3]
Cytotoxic T, effector memory | CD3+CD8+CD62L-CD45RA- | [1.1-9.2]
Double negative T, αβ | CD3+CD4-CD8- | [0.3-1.3]


Percentages before tocilizumab treatment are shown. Values above or below reference ranges are marked by carets (^) or asterisks (*), respectively.
NA, not applicable; TEMRA, T effector memory re-expressing CD45RA. 


Extended Data Table 2 | Effect of tocilizumab treatment and RIPK1 caspase cleavage site mutations is absent in known 
autoinflammatory diseases 



a, Inflammatory markers in subjects treated with tocilizumab. The first time point for each subject is from 3 days before the first tocilizumab injection. P3 had two measurements from the same
week at his 10-month post-tocilizumab evaluation. Reference ranges are given in brackets. b, Variant databases in which mutations in the RIPK1 caspase cleavage site are absent. Variant
databases are not independent. c, Result of additional screening for mutations in the RIPK1 caspase cleavage site.

ALPS, autoimmune lymphoproliferative syndrome; NHGRI, National Human Genome Research Institute; NHLBI, National Heart, Lung, and Blood Institute; NIEHS, National Institute of
Environmental Health Sciences.
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Extended Data Table 3 | Conservation of RIPK1 caspase-8 cleavage site 
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Protein sequences orthologous to human RIPK1 were aligned in 235 vertebrate species, using Multiz alignment in the UCSC Genome Browser. These include representative species from the 
major classes: 51 fish, 3 amphibians (A.), 14 reptiles (Rept.), 58 birds (Aves) and 109 mammals (Mammalia). Most species within these classes, except fish (7 out of 51), contain the very highly 
conserved D324 (human numbering) caspase cleavage site within this region. Notably, nearly all species (223) have a potential caspase cleavage site, D300; however, it is noteworthy that this 
Asp is in most cases succeeded by a large hydrophobic amino acid that is less favourable for caspase cleavage. 
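
The conservation tally described in this legend can be illustrated with a minimal sketch. It assumes the orthologous protein sequences have already been aligned and are available as equal-length strings; the species names, the toy sequences and the helper names `residue_counts` and `fraction_with` are hypothetical and are not taken from the study or from the UCSC tools.

```python
from collections import Counter


def residue_counts(aligned, column):
    """Count the residues found at a 0-based alignment column."""
    return Counter(seq[column] for seq in aligned.values())


def fraction_with(aligned, column, residue="D"):
    """Fraction of aligned sequences carrying `residue` at `column`."""
    counts = residue_counts(aligned, column)
    return counts[residue] / len(aligned)


# Hypothetical toy alignment (not real RIPK1 sequence); '-' marks a gap.
toy_alignment = {
    "species_a": "LQLDC",
    "species_b": "LQLDS",
    "species_c": "LQMDC",
    "species_d": "LQ-EC",
}

if __name__ == "__main__":
    # Column 3 stands in for the aligned position of the cleavage-site Asp.
    print(residue_counts(toy_alignment, 3))
    print(fraction_with(toy_alignment, 3))
```

Grouping the input by taxonomic class before calling `fraction_with` would give per-class figures of the kind quoted in the legend.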
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in reporting. For further information on Nature Research policies, see Authors & Referees and the Editorial Policy Checklist. 


Statistics 
For all statistical analyses, confirm that the following items are present in the figure legend, table legend, main text, or Methods section. 
n/a | Confirmed 


[] X] The exact sample size (n) for each experimental group/condition, given as a discrete number and unit of measurement 
[] 4 A statement on whether measurements were taken from distinct samples or whether the same sample was measured repeatedly 


O X The statistical test(s) used AND whether they are one- or two-sided 
Only common tests should be described solely by name; describe more complex techniques in the Methods section. 


[X]|[_] A description of all covariates tested 
Xx C] A description of any assumptions or corrections, such as tests of normality and adjustment for multiple comparisons 


oO A full description of the statistical parameters including central tendency (e.g. means) or other basic estimates (e.g. regression coefficient) 
AND variation (e.g. standard deviation) or associated estimates of uncertainty (e.g. confidence intervals) 


x Oo For null hypothesis testing, the test statistic (e.g. F, t, r) with confidence intervals, effect sizes, degrees of freedom and P value noted 
Give P values as exact values whenever suitable. 


[X]|{_] For Bayesian analysis, information on the choice of priors and Markov chain Monte Carlo settings 
i] [] For hierarchical and complex designs, identification of the appropriate level for tests and full reporting of outcomes 
[X]|[_] Estimates of effect sizes (e.g. Cohen's d, Pearson's r), indicating how they were calculated 

Our web collection on statistics for biologists contains articles on many of the points above. 


Software and code 


Policy information about availability of computer code 


Data collection Exome sequencing was performed on Illumina HiSeq 2000, HiSeq 2500 and NovaSeq 6000 instruments. RNA sequencing was performed on an Illumina
HiSeq 3000 System.
Cell death was monitored by time-lapse imaging using the IncuCyte live cell analysis imaging system (Essen BioScience) and the Opera Phenix™
High Content Screening System (PerkinElmer, USA).


Data analysis Exome sequencing was analysed as follows: alignment with Novoalign; duplicate marking with Picard; re-alignment, recalibration and
variant calling with GATK; and annotation with Annovar.
RNA sequencing reads were mapped against the human reference genome (GRCh38) using hisat v2.2.1.035. Reads mapped to
haemoglobin genes were removed from further analysis. Mapped reads were quantified using HTSeq36,37. All the count data were
normalized using TCC38 and differentially expressed genes were detected using edgeR39. Gene ontology enrichment analysis was
performed using DAVID37. The original RNA sequencing data are uploaded and available online (Gene Expression Omnibus: GSE127572).
For the Opera Phenix™, images were analysed using the server-based Columbus 2.8.0 software (PerkinElmer, USA) to identify nuclei
based on SiR-DNA staining and dead cells using PI staining. Results were exported as counts per well to be processed and graphed using R
Studio (https://www.R-project.org/) with the tidyverse package (https://CRAN.R-project.org/package=tidyverse).
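
As an illustration of the per-well processing step described above, the following minimal sketch converts exported counts into a percentage of dead cells per condition. It is written in Python rather than the R/tidyverse workflow the authors report, and the field names (`condition`, `nuclei`, `pi_positive`) and the function `percent_dead` are assumptions made for illustration, not the Columbus export format.

```python
from collections import defaultdict
from statistics import mean


def percent_dead(wells):
    """Mean percentage of PI-positive cells per condition.

    `wells` is an iterable of dicts with the hypothetical keys
    'condition', 'nuclei' (total nuclei counted) and 'pi_positive'.
    """
    by_condition = defaultdict(list)
    for well in wells:
        if well["nuclei"] == 0:
            continue  # skip empty wells to avoid division by zero
        pct = 100.0 * well["pi_positive"] / well["nuclei"]
        by_condition[well["condition"]].append(pct)
    return {cond: mean(vals) for cond, vals in by_condition.items()}


# Hypothetical counts, for illustration only.
example_wells = [
    {"condition": "untreated", "nuclei": 400, "pi_positive": 8},
    {"condition": "TNF + SMAC mimetic", "nuclei": 200, "pi_positive": 50},
    {"condition": "TNF + SMAC mimetic", "nuclei": 100, "pi_positive": 35},
]

if __name__ == "__main__":
    for condition, pct in percent_dead(example_wells).items():
        print(f"{condition}: {pct:.1f}% dead")
```

Averaging the per-well percentages, rather than pooling counts across wells, weights every well equally; either choice could be defensible depending on how replicate wells are defined.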

For manuscripts utilizing custom algorithms or software that are central to the research but not yet described in published literature, software must be made available to editors/reviewers. 

We strongly encourage code deposition in a community repository (e.g. GitHub). See the Nature Research guidelines for submitting code & software for further information. 
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Data 


Policy information about availability of data 
All manuscripts must include a data availability statement. This statement should provide the following information, where applicable: 


- Accession codes, unique identifiers, or web links for publicly available datasets 
- A list of figures that have associated raw data 
- A description of any restrictions on data availability 


The original RNA sequencing data is uploaded and available online (Gene Expression Omnibus: GSE127572). 


Field-specific reporting 


Please select the one below that is the best fit for your research. If you are not sure, read the appropriate sections before making your selection. 


xX Life sciences Behavioural & social sciences Ecological, evolutionary & environmental sciences 


For a reference copy of the document with all sections, see nature.com/documents/nr-reporting-summary-flat.pdf 


Life sciences study design 


All studies must disclose on these points even when the disclosure is negative. 


Sample size No sample size calculations were performed. For the in vitro experiments, the variability between biological repeats was very low; when
possible, at least 3 independent biological cell lines were analysed at least twice. For each in vivo experiment, at least 3 to 5
animals per genotype were used and experiments were performed twice to ensure reproducibility.


Data exclusions No data were excluded from the study. 
Replication Experiments were reproduced at least twice and all attempts at replication were successful. 
Randomization | Mice were grouped according to genotype and animals were age- and sex-matched. 


Blinding Animal technicians were blinded to treatment conditions and took body temperature measurements without any input from the
experimental investigator.


Reporting for specific materials, systems and methods 


We require information from authors about some types of materials, experimental systems and methods used in many studies. Here, indicate whether each material, 
system or method listed is relevant to your study. If you are not sure if a list item applies to your research, read the appropriate section before selecting a response. 


Materials & experimental systems Methods 
n/a | Involved in the study n/a | Involved in the study 
Antibodies ChIP-seq 
Eukaryotic cell lines Flow cytometry 
Palaeontology MRI-based neuroimaging 


Animals and other organisms 


Human research participants 


Clinical data 


Antibodies 


Antibodies used RIPK1 N-terminal antibody (clone D94C12, cat number 3493, Cell Signaling Technology)
RIPK1 C-terminal antibody (cat number 610459, BD Transduction Laboratories)
Phospho-RIPK1 (clone D1L3S, cat number 65746, Cell Signaling Technology)
Phospho-RIPK3 (gift from Genentech)
Caspase-6 (clone EPR4405, cat number ab108335, Abcam)
Caspase-8 (clone E7, cat number ab32397, Abcam)
Cleaved caspase-3 (cat number 9661, Cell Signaling Technology)
Cleaved caspase-8 (clone D5B2, cat number 8592, Cell Signaling Technology)
PECAM1 (cat number AF3628, R&D Systems)
goat anti-rabbit AF488 (cat number A-11008, Invitrogen)
donkey anti-goat Cy3 (cat number 705-165-147, Jackson ImmunoResearch)
FADD (clone 7A2, WEHI in house)
IκBα (cat number 9242, Cell Signaling Technology)
phospho-p65 (clone 93H1, cat number 3033, Cell Signaling Technology)
p65 (clone D14E12, cat number 8242, Cell Signaling Technology)
phospho-JNK1/2 (clone cat number 4668P, Cell Signaling Technology)
phospho-p38 (clone D3F9, cat number 4511, Cell Signaling Technology)
phospho-ERK1/2 (cat number 9101, Cell Signaling Technology)
β-Actin (clone AC-15, cat number A-1978; Sigma-Aldrich)


Validation Validation data for commercial antibodies are available on vendor websites. 
Validation of p-RIPK3 has been done on RIPK3 knock-out cells (Figure 3e). GEN135-35-9 anti-mouse phospho-RIPK3 (T231, S232)
is validated for WB and IHC in Newton et al. (2016) Nature 540:129-133.
Validation of anti FADD was with FADD knock-out cells in O'Reilly et al (2004) Cell Death Differ 11:724-736 


Eukaryotic cell lines 


Policy information about cell lines 




Cell line source(s) 293T cells were from ATCC. All mouse cell lines were generated from the different mice in this study.
Authentication Mouse cell lines were sequenced to confirm the RIPK1 D325A genotype. 293T cells were not authenticated.
Mycoplasma contamination 293T cells and most of the mouse cell lines were tested and were negative for mycoplasma.


Commonly misidentified lines No commonly misidentified line was used 
(See ICLAC register) 


Animals and other organisms 


Policy information about studies involving animals; ARRIVE guidelines recommended for reporting animal research 


Laboratory animals All mice are Mus musculus maintained on a C57BL/6 background. Male littermates aged 8-12 weeks were used for Fig. 4a, e
and Extended Data Fig. 6a. Female littermates aged 8-12 weeks were used for Fig. 4d. Mice of both sexes aged 8-12 weeks
were used for timed matings and to generate MDFs and BMDMs. Littermates of both sexes were monitored for enlarged
lymph nodes and spleen until the ethical end point (Extended Data Fig. 3c, d). Two-week-old littermates of both sexes were used for
H&E and caspase-3 staining in Extended Data Fig. 3e.


Wild animals The study did not involve wild animals. 
Field-collected samples The study did not involve samples collected from the field. 
Ethics oversight All mouse experiments were performed according to the guidelines of the animal ethics committee of WEHI 


Note that full information on the approval of the study protocol must also be provided in the manuscript. 


Human research participants 


Policy information about studies involving human research participants 


Population characteristics Patient 1 Female 10 yrs 

Patient 2 Female 82 yrs 

Patient 3 Male 55 yrs 

Patient 4 Female 54 yrs 

Patient 5 Male 22 yrs 

Patient 6 Female 20 yrs. 

Patient 7 Male 13 yrs. 

All had recurrent fevers. For more information please see Table 1.


Recruitment Families were enrolled and evaluated in the Clinical Center at the National Institutes of Health under a protocol approved by the
Institutional Review Board of the National Institute of Diabetes and Digestive and Kidney Diseases and the National Institute of
Arthritis and Musculoskeletal and Skin Diseases. All subjects provided written informed consent. Patients with unexplained
recurrent fevers were recruited.


Ethics oversight All experiments in human samples were performed according to the guidelines of the human ethics committee of the NIH. 


Note that full information on the approval of the study protocol must also be provided in the manuscript. 
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A dominant autoinflammatory disease 
caused by non-cleavable variants of RIPK1 
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Activation of RIPK1 controls TNF-mediated apoptosis, necroptosis and inflammatory 
pathways1. Cleavage of human and mouse RIPK1 after residues D324 and D325,
respectively, by caspase-8 separates the RIPK1 kinase domain from the intermediate 
and death domains. The D325A mutation in mouse RIPK1 leads to embryonic lethality 
during mouse development*?. However, the functional importance of blocking 
caspase-8-mediated cleavage of RIPK1 on RIPK1 activation in humans is unknown. 
Here we identify two families with variants in RIPK1 (D324V and D324H) that lead to 
distinct symptoms of recurrent fevers and lymphadenopathy in an autosomal- 
dominant manner. Impaired cleavage of RIPK1 D324 variants by caspase-8 sensitized 
patients’ peripheral blood mononuclear cells to RIPK1 activation, apoptosis and 
necroptosis induced by TNF. The patients showed strong RIPK1-dependent activation 
of inflammatory signalling pathways and overproduction of inflammatory cytokines 
and chemokines compared with unaffected controls. Furthermore, we show that 
expression of the RIPK1 mutants D325V or D325H in mouse embryonic fibroblasts 
confers not only increased sensitivity to RIPK1 activation-mediated apoptosis and 
necroptosis, but also induction of pro-inflammatory cytokines such as IL-6 and TNF. 
By contrast, patient-derived fibroblasts showed reduced expression of RIPK1 and 
downregulated production of reactive oxygen species, resulting in resistance to 
necroptosis and ferroptosis. Together, these data suggest that human non-cleavable 
RIPK1 variants promote activation of RIPK1, and lead to an autoinflammatory disease 
characterized by hypersensitivity to apoptosis and necroptosis and increased 
inflammatory response in peripheral blood mononuclear cells, as well as a
compensatory mechanism to protect against several pro-death stimuli in fibroblasts. 


RIPK1 is a key mediator of apoptotic and necrotic cell death as well as
inflammatory pathways1. Activation of RIPK1 promotes several cell
death responses, including apoptosis and necroptosis, downstream
of TNFR1. Caspase-8-mediated cleavage after Asp324 in human RIPK1
(or Asp325 in mouse RIPK1) separates the kinase domain in the N-terminal
part of RIPK1 from its intermediate and death domains. The death
domain is involved in mediating the activation of the N-terminal kinase
by dimerization. The D324A variant in human RIPK1 blocks cleavage
by caspase-8. Homozygous D325A mutation in mouse RIPK1 sensitizes
cells to both apoptosis and necroptosis induced by TNF and leads to
embryonic lethality. The early demise of Ripk1D325A/D325A mice can be
rescued by simultaneous deletion of Ripk3 and Fadd, or Mlkl and Fadd,
but not of either gene alone. However, the functional importance of
caspase-8-mediated cleavage of RIPK1 in humans is unknown. Here, we
identified a human autoinflammatory disease caused by non-cleavable
RIPK1 variants with mutations at D324. We show that disrupted cleavage
of RIPK1 by caspase-8 in humans leads to a dominantly inherited
condition by promoting the activation of RIPK1.
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Fig. 1| Heterozygous variants at the RIPK1 cleavage site cause 
autoinflammatory disease in humans. a, Pedigrees of two families with 
variants in RIPK1 at the caspase-8 cleavage site. b, Timeline of recurrent fever 
episodes in P1 over 4 months. Red dots denote increased temperatures during 
fever episodes. Blue boxes denote normal temperatures between flares. 

c, Computerized tomography scans of P1 (top) and sonographic image of P3 
(bottom) show lymphadenopathy (arrows) and splenomegaly, respectively. 


Patients with RIPK1 non-cleavable variants 


The first patient (P1) is a two-year-old Chinese boy. His symptoms began
at two months of age with periodic fever episodes occurring every eight
to ten days and lasting for three to five days (Fig. 1a, b, Extended Data
Table 1). His fevers were associated with increased levels of C-reactive
protein and white blood cell counts, but no other accompanying symptoms.
He developed lymphadenopathy at two years of age (Fig. 1c).
The patient did not have a skin rash, arthritis, arthralgia or
hepatosplenomegaly. Lymphocyte phenotyping revealed increased counts of
both double-negative T cells and naive B cells (Extended Data Table 2).

The second family is of European Canadian ancestry. The proband 
(P2) is a 35-year-old female who experienced recurrent fevers from 
six months of age, and developed intermittent lymphadenopathy, 
hepatosplenomegaly and microcytic anaemia. Three of her four sons 
are affected. Her eldest (P3, 14 years of age) and youngest (P5, 10 years 
of age) sons have a similar history of recurrent fevers, intermittent 
lymphadenopathy, splenomegaly and microcytic anaemia. Her sec- 
ond son (P4, 12 years of age) has microcytic anaemia but no history of 
recurrent fevers (Fig. 1a, c, Extended Data Table 1). 

Whole-exome sequencing (WES) of P1 and his parents revealed that P1
has a heterozygous de novo D324V mutation in RIPK1 (Fig. 1a, Extended
Data Fig. 1a–c). For the second family, WES identified a single variant,
D324H in RIPK1, which is de novo in the mother and inherited by her
three affected sons (Fig. 1a, Extended Data Fig. 1c). No other mutations
(including rare variants of unknown importance in genes known
to cause periodic fever or autoinflammatory syndromes) were found
(Supplementary Tables 1, 2). Copy number variant analysis based on
WES data for the first family, and microarray analysis for the second
family, did not identify any copy number variants among affected
individuals. Both variants affected the caspase-8 cleavage site, D324, which
is highly conserved in RIPK1 across species (Extended Data Fig. 1d, e).
These variants were not reported in any public database of human
exomes and were predicted to be deleterious (combined
annotation-dependent depletion (CADD) score > 20) for protein function by
computational in silico modelling (Extended Data Fig. 1f).

Expression of wild-type and mutant RIPK1 in HEK293T cells indicated
that variants at residue D324, including D324V and D324H, blocked the
cleavage of RIPK1 by caspase-8 (Extended Data Fig. 1g). The D325A mutation
in mouse RIPK1 (the equivalent residue for D324 in human RIPK1) did
not affect its turnover (Extended Data Fig. 1h), or block its interaction
with other proteins such as binding with caspase-8 into the FADDosome
complex (Extended Data Fig. 1i). The variants in D324 resulted in
non-cleavable RIPK1, which was directly demonstrated by incubating mutant
RIPK1 generated by TNT cell-free protein expression with recombinant
caspase-8 (Extended Data Fig. 1j). The inhibitory effect of the D324V
variant on the cleavage of RIPK1 by caspase-8 was further confirmed
in patient P1 fibroblasts after stimulation by TNF and cycloheximide
(CHX) (Extended Data Fig. 1k).


Activation of inflammatory signalling in the patients 


We detected markedly increased production of pro-inflammatory
cytokines and chemokines such as IL-6, TNF and IFNγ, and
anti-inflammatory cytokines such as IL-10 in serum from patients by cytometric
bead array (Fig. 2a) or enzyme-linked immunosorbent assay (ELISA)
(Extended Data Table 3). Serial sampling from P1 showed that activation
of inflammatory responses was even more notable during fever episodes
(Fig. 2a). Increased expression of IL-6, TNF and IL-8 in monocytes
and IL-6 in T cells from P1 after stimulation by lipopolysaccharide (LPS)
was detected by intracellular cytokine staining (Extended Data Fig. 2a,
b). Moreover, phosphorylation of STAT3, the downstream marker of IL-6
signalling, was upregulated during fever episodes in patient monocytes
at basal level when compared with unaffected controls (Extended Data
Fig. 2c). We also observed increased phosphorylation of MAPK p38 in
patient monocytes, B cells and T cells after LPS stimulation (Extended
Data Fig. 2c).

To study the transcriptional changes related to non-cleavable RIPK1
further, we performed single-cell RNA sequencing in patient peripheral
blood mononuclear cells (PBMCs). The patient had a higher percentage
of monocytes compared with an age- and sex-matched unaffected
control (Fig. 2b, Extended Data Fig. 2d). We observed strong signals
in both NF-κB and type I IFN inflammatory pathways in the patient
monocytes (Extended Data Fig. 2e, f). The patient monocytes highly
expressed pro-inflammatory cytokines and chemokines, including
IL8, IL1B and CCL3 (Fig. 2c, Extended Data Figs. 2g, 3a, b). In addition,
RNA sequencing in PBMCs implicated different gene expression patterns
in cell death pathways that include increased expression of RIPK3
and MLKL, suggesting increased levels of necroptosis machinery in
the patient PBMCs (Fig. 2d). Quantitative PCR (qPCR) confirmed the
increased expression of IL6, TNF, IL8, IL1B, CXCL2 and CXCL3 in patient
PBMCs (Fig. 2e). Supporting a pathogenic role of excessive IL-6 production,
P1 experienced clinical improvement and the PBMCs displayed
normalized expression of inflammatory mediators after treatment
with tocilizumab (monoclonal antibody against IL-6R) (Extended Data
Fig. 4a).


Increased cell death and inflammatory response 


We examined the response of patient PBMCs to TNF by measuring cell
survival with the CellTiter-Glo assay, and quantified cell death by measuring
the plasma membrane permeability with ToxiLight assay (Fig. 3a,
Extended Data Fig. 4b). The PBMCs from patients P1, P2 and P3 showed
increased sensitivity to both apoptosis induced by co-treatment with
TNF and apoptosis-inducing SMAC mimetic SM-164, and necroptosis
induced by co-treatment with SM-164 and the pan-caspase inhibitor
Z-VAD-FMK (carbobenzoxy-valyl-alanyl-aspartyl-[O-methyl]-fluoromethylketone)
compared with PBMCs from unaffected controls. Co-treatment
was required as treatment with these compounds individually
did not elicit cell death. Furthermore, both apoptosis and necroptosis
of patient PBMCs were effectively suppressed by the RIPK1 inhibitor
necrostatin-1s (Nec-1s) (Fig. 3a, Extended Data Fig. 4b). We found that
levels of RIPK1 phosphorylated at S166 (p-S166-RIPK1), a marker for the
activation of RIPK1, were increased in the patient PBMCs treated with
various combinations of these compounds known to activate apoptosis
or necroptosis, compared to that of controls (Fig. 3b, Extended Data
Fig. 4c), which suggests that blocking the cleavage of RIPK1 sensitizes
the activation of its kinase activity. We also found increased levels of
p-S358-MLKL, a biomarker for necroptosis, in the patient PBMCs
treated with SM-164 plus Z-VAD-FMK compared with that of control
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Fig. 2| Strong activation of inflammatory signalling in patient P1. a, Serum 
levels of cytokines IL-6, TNF and IL-10 from patient P1 (red denotes serum 
during fever episodes; blue denotes serum during remission) determined by 
cytometric bead array. b, Left, uniform manifold approximation and projection 
(UMAP) of 18,928 cells, split between patient P1 and an age- and sex-matched 
unaffected control (C1) after alignment. Right, UMAP visualization and marker- 
based annotation of 16 cell subtypes, coloured by cluster identity. The patient 
P1 displayed higher percentage of monocytes (red frame). Eryth, erythrocytes; 
Mk, megakaryocyte; NK, natural killer cell; pDC, plasmacytoid dendritic cell. 

c, Visualization of expression of IL8 and IL1B (coloured single cells) on UMAP
plot projecting PBMCs from patient P1 (n = 7,936 cells) and an age- and sex-
matched unaffected control (n = 10,992 cells). d, RNA-sequencing analysis of
cell death, NF-κB and type I IFN pathways in patient PBMCs compared with
three paediatric unaffected controls (C1–C3). Analysis of each sample was
performed in duplicate. For gene names, see Supplementary Fig. 2. e, qPCR
analysis of cytokine and chemokine-related genes in PBMCs from P1 compared
with five paediatric unaffected controls (C). Data are mean ± s.e.m. Circles
correspond to each tested individual. Analysis of each sample was performed
in triplicate. The PBMCs from patient P1 in b–e were obtained during fever
episodes.


cells (Fig. 3b). Notably, patient PBMCs treated with TNF and SM-164
showed increased levels of not only cleaved caspase-3, but also p-S358-
MLKL, which were both effectively reduced by treatment with Nec-1s
(Fig. 3b). These results suggest that the non-cleavable RIPK1 variant
sensitized the patient PBMCs to both necroptosis and apoptosis in a
RIPK1-dependent manner.

Release of cyclophilin A is a biomarker for necroptosis in cell-based
assays and has also been implicated as a potential biomarker in human
diseases. We detected the presence of cyclophilin A in a urine sample
from a patient during a fever episode but not in remission, which provides
evidence for enhanced necrotic cell death in the setting of inflammation
in vivo (Fig. 3c). These findings indicate that the non-cleavable
RIPK1 variant may promote the activation of RIPK1, which leads to
necrotic cell death in vivo.

Activation of necroptosis promotes a strong inflammatory response 
such as the production of pro-inflammatory cytokines. Compared to 
that of control PBMCs, the patient PBMCs stimulated with TNF plus 
SM-164 showed an exacerbated inflammatory response, which was 
effectively inhibited by the RIPK1 inhibitor Nec-1s (Fig. 3d). Confirming 
the involvement of RIPK1 kinase activity in promoting the inflammatory 
responses, we found that the increased IL6 expression owing to 
cell death induced by TNF plus SM-164 stimulation in patient PBMCs 
was suppressed by Nec-1s (Fig. 3e). 

The patient data raised the possibility that non-cleavable RIPK1 
variants directly promote their own activation, which in 
turn mediates apoptosis and necroptosis in a signal-dependent manner. 
To test this possibility experimentally, we expressed the cleavage 
site D325V and D325H RIPK1 mutants in Ripk1-knockout mouse 
embryonic fibroblasts (MEFs). Compared to Ripk1-knockout 
MEFs and Ripk1-knockout MEFs complemented with wild-type RIPK1, 
MEFs expressing the D325V or D325H RIPK1 mutant were consistently 
hypersensitive to cell death induced by TNF, which was inhibited by 
the addition of Nec-1s. The enhanced cell death could also be blocked 
by introducing a RIPK1 kinase inactivation mutation, D138N, in cis with 
D325V or D325H, providing direct evidence for the role of RIPK1 kinase 
activity in promoting cell death (Fig. 3f, Extended Data Fig. 5a, b). Simi- 
lar to patient PBMCs, MEFs expressing D325V or D325H mutant RIPK1 
stimulated by TNF alone or TNF plus SM-164 showed increased levels of 
p-S166-RIPK1 (Fig. 3g). By contrast, stimulation of Ripk1-knockout MEFs 
complemented with wild-type RIPK1 with TNF alone was not sufficient 
to promote the activation of RIPK1 (Fig. 3g). These data support the 
hypothesis that the non-cleavable variants of RIPK1 directly promote 
the activation of RIPK1. 

Similar to that of patient PBMCs, RIPK1(D325V)- or RIPK1(D325H)- 
complemented Ripk1-knockout MEFs stimulated by TNF or TNF plus 
SM-164 showed increased levels of cleaved caspase-3 compared to that 
of wild-type-complemented MEFs, which was inhibited by Nec-1s and 
by the kinase inactivation mutation D138N in cis with D325V or D325H 
construct (Fig. 3g, h, Extended Data Fig. 5c). Also similar to that of 
patient PBMCs, the stimulation of Ripk1-knockout MEFs expressing 
D325V or D325H RIPK1 mutant with TNF alone or TNF plus SM-164 
induced increased levels of p-S345-MLKL (Fig. 3g, Extended Data 
Fig. 5c). By contrast and as expected, stimulation of Ripk1-knockout 
MEFs or Ripk1-knockout MEFs complemented with wild-type RIPK1 
with TNF alone or TNF plus SM-164 was not sufficient to promote 
the activation of necroptosis and appearance of p-S345-MLKL. TNF- 
induced cell death and the appearance of p-S345-MLKL in D325V- or 
D325H-complemented Ripk1-knockout MEFs were both blocked by 
Nec-1s and by the inactivation D138N mutation in cis with D325V or 
D325H (Fig. 3f-h, Extended Data Fig. 5c). These results suggest that 
D325V and D325H are gain-of-function mutations in RIPK1 that pro- 
mote the activation of its kinase, which in turn mediates apoptosis 
and necroptosis. 

Because the expression of non-cleavable RIPK1 promotes both 
apoptosis and necroptosis, we next determined whether these two 
forms of cell death might be independent of each other by examining 
Ripk1^D325A/D325A Ripk3^−/− MEFs from Ripk1^D325A/D325A knock-in mice crossed 
with necroptosis-deficient Ripk3^−/− mice. Notably, we found that 
Ripk1^D325A/D325A Ripk3^−/− MEFs remained sensitized to apoptosis induced 
by TNF alone and TNF plus SM-164 and showed increased levels of 
p-S166-RIPK1 and cleaved caspase-3, which were both inhibited by 
Nec-1s (Fig. 3i, j, Extended Data Fig. 5d, e). Thus, the activated RIPK1 in 
cells expressing non-cleavable RIPK1 is able to drive RIPK1-dependent 
apoptosis, independently of necroptosis. 
Fig. 3 | RIPK1 cleavage site variants promote cell death and inflammatory 
response induced by TNF in patient PBMCs and MEFs. a, Cell viability (as 
measured by CellTiter-Glo assay) of PBMCs from patients and eight paediatric 
unaffected controls after treatment as indicated for 24 h. N, Nec-1s; S, SM-164; 
T, TNF; Z, Z-VAD-FMK. Data are mean ± s.e.m. Circles indicate one sample from 
each individual (P1 was sampled three times). Analysis of each sample was 
performed in triplicate. b, Western blots of PBMCs from patient P1 and a 
paediatric unaffected control after treatment as indicated for 24 h. -, 
untreated; cl, cleaved. For gel source data, see Supplementary Fig. 1. Results are 
representative of two independent experiments. c, Western blots of urine 
samples from P1 during a fever episode (red) and remission (blue) and three 
paediatric unaffected controls. Supernatant (sup.) of fibroblasts from an 
unaffected control stimulated with TNF, SM-164 and Z-VAD-FMK (TSZ) served 
as a positive control. For gel source data, see Supplementary Fig. 1. Results are 
representative of three independent experiments. d, NanoString analysis of 
PBMCs from patient P1 and three paediatric unaffected controls after 
stimulation as indicated. For gene names, see Supplementary Fig. 2. e, qPCR 
analysis of IL6 mRNA levels of PBMCs from patient P1 and four unaffected 
controls treated as indicated. Data are mean ± s.e.m. Circles correspond to 
each tested individual. Analysis of each sample was performed in triplicate. 
The PBMCs in a, b, d and e were obtained during remission. f, Cell viability of 
Ripk1-knockout MEFs complemented with: GFP; wild-type (WT) RIPK1; D325V, 


We also examined the effect of non-cleavable RIPK1 on ligands of 
other death receptors such as TRAIL. We found that the cells expressing 
the non-cleavable mutant RIPK1 also showed increased sensitivity 
to TRAIL-induced cell death, which could be rescued by the addition 
of Nec-1s (Extended Data Fig. 5f). Levels of p-S166-RIPK1 were also 
increased after TRAIL stimulation in mutant RIPK1-complemented 
MEFs compared to wild-type RIPK1-complemented MEFs (Extended 
Data Fig. 5g). These data further illustrated that the non-cleavable 
mutations in RIPK1 increase RIPK1 kinase activity and sensitize the 
cells to cell death after stimulation by several stimuli. 

We next characterized the effect of non-cleavable RIPK1 on 
cytokine production. NanoString analysis showed that patient PBMCs 
stimulated with TNF alone exhibited upregulated gene expression 
in the inflammatory pathway, including IL6, which was reduced by 
Nec-1s (Extended Data Fig. 4d). Because patient P1 responded well to 
D325H or D138N mutant; or D138N/D325H or D138N/D325V double mutants, 
treated as indicated for 24 h. Data are mean ± s.e.m., n = 3. Circles correspond to 
each independent experiment. P values determined by unpaired two-tailed 
t-test (shown if P < 0.05). g, h, Western blots of Ripk1-knockout MEFs 
complemented with: GFP; wild-type RIPK1; D325V, D325H or D138N mutant; or 
D138N/D325H or D138N/D325V double mutants, treated as indicated. HA, 
haemagglutinin; LE, long exposure; SE, short exposure. For gel source data, see 
Supplementary Fig. 1. Results are representative of three independent 
experiments. i, Cell viability of Ripk1^D325A/D325A Ripk3^−/− and Ripk1^+/+ Ripk3^−/− MEFs 
treated as indicated for 24 h. Data are mean ± s.e.m., n = 3. Circles correspond to 
each independent experiment. P values determined by unpaired two-tailed 
t-test. j, Western blots of Ripk1^D325A/D325A Ripk3^−/− and Ripk1^+/+ Ripk3^−/− MEFs 
treated as indicated. For gel source data, see Supplementary Fig. 1. Results are 
representative of three independent experiments. k, Il6 mRNA expression of 
Ripk1-knockout MEFs complemented with: GFP; wild-type RIPK1; or D325V or 
D325H mutant, treated as indicated. Data are mean ± s.e.m., n = 3. Circles 
correspond to each independent experiment. P values determined by unpaired 
two-tailed t-test. l, m, qPCR analysis of Il6 and Cxcl2 (l) or Il6, Cxcl2 and Tnf (m) 
expression in Ripk1^D325A/D325A Ripk3^−/− and Ripk1^+/+ Ripk3^−/− MEFs treated as 
indicated for 2 or 4 h. Data are mean ± s.e.m., n = 3. Circles correspond to each 
independent experiment. P values determined by unpaired two-tailed t-test. 


IL-6 blockade, we also compared Il6 expression in Ripk1-knockout 
MEFs expressing GFP alone, wild-type RIPK1 or mutant 
RIPK1(D325V or D325H) (Fig. 3k). We found that MEFs expressing the 
D325V or D325H mutant showed distinctively enhanced transcription 
of Il6 compared to that of wild-type-complemented Ripk1-knockout 
MEFs in response to TNF alone. In addition, the transcription 
of Il6, Cxcl2 and Tnf was also enhanced in Ripk1^D325A/D325A 
Ripk3^−/− MEFs after stimulation by TNF or both TNF and SM-164. The 
enhancement was inhibited by the addition of Nec-1s (Fig. 3l, m). 
Together, these results suggest that the augmented inflammatory 
signals associated with the non-cleavable variants were dependent 
on RIPK1 kinase activity. In keeping with the patient’s therapeutic 
response to IL-6 blockade, these results demonstrate a pathogenic 
mechanism that relies on the activation of RIPK1 to mediate the 
production of IL-6. 
Fig. 4 | Necroptosis and ferroptosis are repressed in the patient fibroblasts. 
a, Cell viability of fibroblasts from P1 and seven paediatric unaffected controls 
after treatment with indicated stimulation for 24 h. N, Nec-1s; S, SM-164; T, TNF; 
Z, Z-VAD-FMK. Data are mean ± s.e.m. Circles correspond to each tested 
individual. Analysis of each sample was performed in triplicate. b, Western 
blots of fibroblasts from patient P1 and four paediatric unaffected controls 
after treatment with indicated stimulation for 6 h. For gel source data, see 
Supplementary Fig. 1. Results are representative of three independent 
experiments. VAD, Z-VAD-FMK. c, Patient and seven paediatric unaffected 
control fibroblasts were treated as indicated for 6 h. The mRNA levels of 
cytokines were measured by qPCR. Data are mean ± s.e.m. Circles correspond 
to each tested individual. Analysis of each sample was performed in triplicate. 
d, Western blots of RIPK1 and TNFR1 protein levels at basal state and after TNF 
stimulation in fibroblasts from P1 and four paediatric unaffected controls. For 
gel source data, see Supplementary Fig. 1. Results are representative of three 
independent experiments. e, Cell viability of patient fibroblasts treated with 
erastin, RSL3 or FIN56 compared with six paediatric unaffected controls. Data 


Necroptosis and ferroptosis resistance in fibroblasts 


Notably, we observed the opposite response to inducers of cell 
death in fibroblasts from patient P1 compared with MEFs and patient 
PBMCs. The patient fibroblasts showed resistance to cell death after 
stimulation with TNF or LPS plus SM-164 and Z-VAD-FMK (Fig. 4a, 
Extended Data Fig. 6a). The cell death resistance was further demon- 
strated by reduced phosphorylation of RIPK1 and MLKL in the patient 
fibroblasts (Fig. 4b, Extended Data Fig. 6b, c). Patient fibroblasts 
also showed diminished gene expression of IL6, IL1B and the pro-inflammatory 
chemokines CXCL2 and CXCL3 in response to TNF, SM-164 
and Z-VAD-FMK stimulation (Fig. 4c). Levels of RIPK1 protein under 
basal and stimulated conditions were lower in patient fibroblasts 
than in controls (Fig. 4d), and the reduction was rescued by 
Nec-1s (Extended Data Fig. 6d). We observed reduction of RIPK1 
at the transcriptional level in fibroblasts (Extended Data Fig. 4e, 
Extended Data Fig. 6e). Together, these results suggest that decreased 
RIPK1 expression may compensate for the presence of the RIPK1- 
activating variant in patient fibroblasts. In addition, we found that 
patient fibroblasts exhibited reduced expression of TNFR1, which 
may provide a further mechanism for the decreased sensitivity to 
TNF (Fig. 4d, Extended Data Fig. 6f). Patient fibroblasts also showed 
decreased expression of genes involved in cell death pathways such 
as RIPK1 and RIPK3 and a different gene expression pattern (Extended 
Data Fig. 6g) compared to PBMCs (Fig. 2d). Together, these findings 
provide evidence of compensatory mechanisms to resist cell death 
in the patient fibroblasts. 


are mean ± s.e.m. Circles correspond to each tested individual. Analysis of each 
sample was performed in triplicate. f, Western blots of fibroblasts from P1 and 
three paediatric unaffected controls after treatment with erastin for 4 or 8 h. 
For gel source data, see Supplementary Fig. 1. Results are representative of 
three independent experiments. g, Expression patterns of genes involved in 
ferroptosis and antioxidant pathways by RNA sequencing of fibroblasts from patient and 
three paediatric unaffected controls. Analysis of each sample was performed in 
duplicate. For gene names, see Supplementary Fig. 2. h, GSH concentrations in 
fibroblasts from P1 compared with six paediatric unaffected controls at 
baseline and after treatment with erastin or glutamate for 8 h. Data are 
mean ± s.e.m. Circles correspond to each tested individual. Analysis of each 
sample was performed in triplicate. i, Immunofluorescence (left) and relative 
fluorescent intensity (right) of cytosolic ROS (green foci) in patient fibroblasts 
after treatment with erastin for 8 h compared with that of three paediatric 
unaffected controls. Scale bar, 150 μm. Circles correspond to each tested 
individual sample. Analysis of each sample was performed in duplicate. 


We also characterized the sensitivity of patient fibroblasts to other 
cell death stimuli. Notably, we found that the patient fibroblasts were 
highly protected against ferroptosis induced by erastin, RSL3 or FIN56 
(Fig. 4e), an effect that was not found in patient PBMCs or MEFs expressing 
the RIPK1 D325V or D325H mutant (Extended Data Fig. 4f, Extended 
Data Fig. 5h). Consistent with these findings, erastin-induced degradation 
of GPX4 was blocked in patient fibroblasts (Fig. 4f), but not 
in Ripk1-knockout MEFs expressing mutant RIPK1 (Extended Data Fig. 5i). To 
explore the mechanism of ferroptosis resistance, we analysed gene 
expression in patient fibroblasts by RNA sequencing. We found that 
several genes involved in inhibiting ferroptosis, such 
as SLC7A11, CISD1 and CD44, were upregulated in patient fibroblasts 
(Fig. 4g). This pattern was not observed in the patient PBMCs or MEFs 
(Extended Data Figs. 4g, h, 5j). Similarly, the concentration of the antioxidant 
glutathione (GSH) was much higher in patient fibroblasts than 
in controls (Fig. 4h). By contrast, GSH levels were similar in PBMCs 
or MEFs expressing wild-type or mutant RIPK1 (Extended Data Figs. 4i, 
5k). Consistent with increased levels of GSH, the amounts of reactive 
oxygen species (ROS) (as indicated by the cytosolic ROS sensor carboxy-H2DCFDA) 
were lower after erastin stimulation in patient fibroblasts 
(Fig. 4i), but not in Ripk1-knockout MEFs complemented with mutant 
RIPK1 (Extended Data Fig. 5l). These data suggest that restricted release 
of ROS by the patient fibroblasts may help to protect against ferroptosis, 
as ROS is known to be crucial for mediating ferroptosis. Similarly, 
because ROS production can promote RIPK1 activation and necroptosis, 
the high levels of antioxidant GSH in the patient fibroblasts may 
also contribute to the resistance to necroptosis. 
Discussion 


Our study identified a dominantly inherited autoinflammatory disease 
caused by impaired caspase-8 cleavage in RIPK1. This condition is distinct 
from the previously reported recessively inherited RIPK1-deficient 
condition that is characterized by immune deficiency. By contrast, 
we show that patients with one copy of mutated RIPK1 in the caspase-8 
cleavage site present with symptoms of immune dysfunction, including 
recurrent fevers and lymphadenopathy. 

Our data highlight the role of RIPK1 kinase activity in promoting not 
only both apoptosis and necroptosis but also transcriptional production 
of pro-inflammatory cytokines, such as IL-6, which is a previously 
underappreciated aspect of RIPK1 biology. These results suggest that 
the periodic fevers of these patients may reflect the augmented produc- 
tion of cytokines such as IL-6 in response to what may be benign stimuli 
for normal individuals. Activated RIPK1 has been shown to mediate 
transcription of pro-inflammatory cytokines in myeloid lineages, inde- 
pendent of cell death, in neurodegenerative diseases”. In addition, 
cytokines such as TNF in turn can further promote cell death, thus 
establishing a vicious circle of inflammation that culminates in the 
development of an autoinflammatory disease. 

We show that patient fibroblasts may have developed several com- 
pensatory mechanisms to protect against deleterious effects of acti- 
vated RIPK1, including downregulating the expression of RIPK1 and 
TNFR1, as well as promoting anti-ROS mechanisms. These findings 
provide insights into the complex disease mechanisms behind non-cleavable 
RIPK1 variants in humans compared to those of the mouse 
models. Our study also linked an activating RIPK1 variant to ferroptosis, 
which sheds light on the diverse roles of RIPK1 in regulating several 
cell death pathways. 
Methods 


Patients 

Patient P1 was evaluated under protocols approved by the Institutional 
Review Board of Children’s Hospital of Fudan University. 
Patients P2, P3, P4 and P5 and their unaffected family members were 
evaluated at McMaster Children’s Hospital and the Hospital for Sick 
Children. Signed consent for their clinical information to be shared 
and for research samples to be sent to Boston Children’s Hospital 
was obtained. Ethics clearance was received from the Institutional 
Review Board at Boston Children’s Hospital and from Western Institutional 
Review Board. All relevant ethical regulations were followed. 
All patients and/or substitute decision makers provided written 
informed consent. 


Unaffected controls 

We used unaffected controls for functional assays. Paediatric unaffected 
controls were less than 10 years old and had no symptoms of 
inflammation at the time of sampling. 


WES 

DNA from whole blood was extracted using a Maxwell RSC Whole 
Blood DNA Kit (Promega, AS1520). One microgram of DNA was used 
for whole-exome sequencing. For the first family, WES and data analysis 
were performed as previously described. Variants were annotated 
by ANNOVAR (2018Apr16). Candidate variants were filtered to 
remove those present in gnomAD, Kaviar, dbSNP and an in-house 
database. Variants were further filtered by de novo or dominant 
inheritance. For the second family, WES was performed and analysed 
concurrently for the proband, both parents and one affected son as 
previously described. Other affected or unaffected family members 
were tested by Sanger sequencing for the presence or absence of the 
de novo variant identified in the proband. 
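
The filtering logic described above can be illustrated with a minimal sketch. It is not the pipeline used in the study: the `Variant` record, its field names and the placeholder genes `GENE_A` and `GENE_B` are hypothetical, and real ANNOVAR output would need to be parsed into such records first. The sketch keeps only variants that are absent from the population databases and that fit a de novo pattern in a trio.

```python
from dataclasses import dataclass, field

# Databases named in the text; a variant listed in any of them is removed.
POPULATION_DATABASES = {"gnomAD", "Kaviar", "dbSNP", "in-house"}


@dataclass
class Variant:
    """Hypothetical record for one annotated variant in a trio."""
    gene: str
    change: str
    in_databases: set = field(default_factory=set)  # databases listing the variant
    proband_has: bool = True
    mother_has: bool = False
    father_has: bool = False


def is_novel(v: Variant) -> bool:
    # Novel means not present in any of the population databases.
    return not (v.in_databases & POPULATION_DATABASES)


def is_de_novo(v: Variant) -> bool:
    # De novo pattern: carried by the proband and by neither parent.
    return v.proband_has and not v.mother_has and not v.father_has


def filter_candidates(variants):
    return [v for v in variants if is_novel(v) and is_de_novo(v)]


# Illustrative input: only the RIPK1 entry reflects the paper; the rest is made up.
example = [
    Variant("RIPK1", "D324V"),
    Variant("GENE_A", "R10Q", in_databases={"gnomAD"}),  # known variant, removed
    Variant("GENE_B", "L55P", mother_has=True),          # inherited, removed
]
candidates = filter_candidates(example)
print([f"{v.gene} {v.change}" for v in candidates])
```

A dominant-inheritance filter for the second family would replace `is_de_novo` with a check that the variant segregates with affected status.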


Sanger sequencing 
Sanger sequencing was used to confirm variants identified by exome 
sequencing as previously described. 


Cell preparation, culture and stimulation 

The HEK293T cell line was from the American Type Culture Collection. 
Ripk1 gene knockout MEFs were established from Ripk1^−/− mice. 
MEFs derived from D325A knock-in mice were provided by J. Zhang. 
PBMCs were separated by lymphocyte separation medium (LSM) and 
SepMate tubes (Stemcell) according to the manufacturer’s instruc- 
tions. Fibroblasts were derived from skin biopsies of patient and 
control donors. HEK293T cells, MEFs and fibroblasts were grown 
in DMEM (Gibco) supplemented with 10% fetal bovine serum (FBS) 
(ExCell Bio) and penicillin/streptomycin (HyClone). PBMCs were 
grown in RPMI-1640 (Gibco) supplemented with 10% FBS and peni- 
cillin/streptomycin. All cell lines tested negative for mycoplasma 
contamination. 

Recombinant human TNF (Peprotech, 300-01A) was used to stimulate 
PBMCs (50 ng ml⁻¹, 100 ng ml⁻¹), fibroblasts (20 ng ml⁻¹, 50 ng ml⁻¹) and 
MEFs (50 ng ml⁻¹) for the indicated amount of time. LPS (Sigma, L6529) was 
used to stimulate PBMCs (1 μg ml⁻¹), MEFs (1 μg ml⁻¹) and fibroblasts (1 ng 
ml⁻¹) for the indicated amount of time. TRAIL (R&D, 1121-TL) was used to 
stimulate MEFs (100 ng ml⁻¹) for the indicated amount of time. Z-VAD-FMK 
(100 μM) and SM-164 (50 nM) (from Selleck) and Nec-1s (10 μM) (made 
by custom synthesis) were used to treat PBMCs, MEFs and fibroblasts. 
Erastin and RSL3 were used to induce ferroptosis in PBMCs (10 μM, 
1 μM), MEFs (10 μM, 0.5 μM) and fibroblasts (10 μM, 0.5 μM). 


RNA sequencing 
One microgram of RNA was used for library preparation. Libraries were 
generated using NEBNext Ultra RNA Library Prep Kit for Illumina (NEB) 


following the manufacturer’s recommendations, and index codes were added 
to attribute sequences to each sample. Library quality was assessed on the 
Agilent Bioanalyzer 2100 system. The libraries were sequenced on an Illumina 
Novaseq and 150-bp paired-end reads were generated. Sequenced 
reads were mapped against the human reference genome (GRCh38) or 
mouse reference genome (GRCm38) using HISAT2. featureCounts was 
used to count the number of reads mapped to each gene. Differential 
expression analysis was performed using the DESeq2 R package. 


Single-cell RNA sequencing 

A 10X Genomics Chromium machine was used for capture of 8,000-10,000 single cells 
and cDNA preparation. The machine divided thousands of cells 
into nanolitre-scale Gel Bead-In-EMulsions for barcoding, followed by clean-up 
using silane magnetic beads and Solid Phase Reversible Immobilization 
beads. Barcoded cDNA was then amplified by PCR. The library was 
constructed according to the manufacturer’s instructions. Sequencing 
was carried out on an Illumina Novaseq. Sequence data were processed with 
Cell Ranger V3.0.1 (10X Genomics). The resulting count matrices were analysed with 
the standard pipeline with default parameters. The UMAP plots were calculated 
based on the first 20 components of the CCA, and clusters were 
identified using the Seurat R package (https://satijalab.org/seurat/). 


NanoString assay 

One-hundred nanograms of total RNA was used for NanoString assay 
and gene expression analysis was conducted using the nCounter Analy- 
sis System (NanoString Technologies) with a codeset designed to target 
594 immunologically related genes. NanoString assay and data analysis 
were performed as previously described. 


Quantitative RT-PCR assay 

Total RNA from fibroblasts, MEFs and PBMCs was extracted using the 
RNeasy Mini kit (Qiagen, 74104). cDNA was generated by the PrimeScript 
RT reagent kit with gDNA Eraser (Perfect Real Time) (Takara, 
RR047A), and qPCR was performed using TB Green Premix Ex Taq II 
(Tli RNaseH Plus) (Takara, RR820A). The reactions were run on an Applied 
Biosystems 7500 Real-Time PCR System (Life Technologies) and a ROCHE 
480II. Relative mRNA expression was normalized to ACTB or GAPDH 
and analysed by the ΔΔCt method. 
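
As an illustration of the comparative Ct (ΔΔCt) calculation named above, the following sketch computes relative expression of a target gene against a reference gene such as ACTB or GAPDH. The function name and the Ct values are invented for the example, and the formula assumes roughly 100% amplification efficiency (a doubling per cycle), which the text does not state.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the comparative Ct method.

    Assumes the amplicon doubles every cycle (100% efficiency).
    """
    # Normalize the target gene to the reference gene within each condition.
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    # Compare the normalized values between sample and control.
    delta_delta = delta_sample - delta_control
    return 2.0 ** (-delta_delta)


# Made-up Ct values: the target crosses threshold 3 cycles earlier in the
# sample than in the control, with identical reference-gene Ct values.
fold_change = relative_expression(24.0, 18.0, 27.0, 18.0)
print(fold_change)  # 8.0
```

With technical triplicates, the Ct values would typically be averaged per sample before this step.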


Antibodies and expression plasmids 
The following antibodies were purchased from Cell Signaling Technology: 
β-actin (4970), β-tubulin (86298), GAPDH (5174), RIPK1 (3493), 
p-RIPK1 (Ser166) (65746), MLKL (14993), p-MLKL (Ser358) (91689), 
p-MLKL (Ser345) (37333), p65 (8242), p-p65 (Ser65) (3033), IKKα 
(11930), IKKβ (2370), p-IKKα/β (Ser176/180) (2697), IκBα (4814), p-IκBα 
(Ser32) (2859), p38 (8690), p-p38 (Thr180/Tyr182) (4511), TNFR1 (3736), 
caspase-8 (4790), cleaved-caspase-8 (8592), caspase-3 (9662), cleaved-caspase-3 
(Asp175) (9661), HA-tag (3724), SLC7A11 (12691). Cyclophilin 
A (ab41684), GPX4 (ab125066), LAMP2A (ab125068), COX2 (ab15191) 
and p53 (ab32389) were purchased from Abcam. FADD (sc-6036) and 
ACSL4 (sc-365230) were purchased from Santa Cruz Biotechnology. 
p-RIPK1 (Ser166) (BX60008) was made by Biolyx. HSC70 (10654-1-AP) 
was purchased from Proteintech Group. HSP90 (BF9107) was purchased 
from Affinity. MLKL (reactivity for Mus musculus) was homemade. 
Human wild-type RIPK1 plasmid (RC216024) was from Origene, 
and the mutant RIPK1 plasmids (D324V, D324H and D324K) were con- 
structed by site-directed mutagenesis. Mouse wild-type RIPK1 plasmid 
was generated by PCR amplification from the cDNAs of MEFs, and then 
cloned into the pMSCV vector made in-house, and the mutant mouse 
RIPK1 plasmids (D325V and D325H) were constructed by site-directed 
mutagenesis. 


Immunoprecipitation and western blotting 
Cells were lysed in cold cell lysis buffer (20 mM Tris-HCl, pH 7.4, 150 mM 
NaCl, 0.5% NP-40, protease and phosphatase inhibitor mixture (Thermo 
Fisher, 78442) and 10% glycerol) for 10 min and centrifuged at 20,000g 
for 10 min. Protein concentration was measured on the cleared lysates 
by BCA protein assay kit (Thermo Fisher, 23225). Immunoprecipitation 
and immunoblotting were conducted as described previously with 
specific antibodies. 


In vitro RIPK1 cleavage assay 

Unlabelled in vitro transcription and translation (IVTT) of 1 μg wild-type 
and mutant RIPK1 constructs were performed in 50 μl reactions 
using the TNT T7 Quick Coupled Transcription/Translation System 
(Promega, L1170). The reaction was incubated with purified recom- 
binant caspase-8 protein (R&D, 705-C8/CF) and then immunoblotted 
with RIPK1 antibody. 


Cytokine detection in serum 

The concentrations of the cytokines IL-6, TNF and IL-10 in serum were 
measured by BD Cytometric Bead Array (BD Biosciences). 
All data were analysed by FCAPArray V3 software (BD Biosciences). 


Flow cytometry analysis of phosphorylation 

For phos-flow staining, isolated PBMCs were treated with or without LPS 
(1 μg ml⁻¹) for 6 h at 37 °C with 5% CO2 and then permeabilized with Perm 
Buffer III according to the manufacturer’s instructions (BD Biosciences). 
Surface markers CD3, CD14 and CD19 (BD Biosciences) were used to gate 
total T cells, monocytes and total B cells. The expression of p-STAT3, 
p-p65 and p-p38 was analysed by flow cytometry. For phos-flow analysis, 
the following antibodies were used: Alexa Fluor 647-conjugated 
antibody against STAT3 phosphorylated at Y705 (BD Biosciences), Alexa 
Fluor 488-conjugated antibody against NF-κB p65 phosphorylated at 
S529 (BD Biosciences) and Alexa Fluor 488-conjugated antibody against 
p38 phosphorylated at T180/Y182 (BD Biosciences). Isotype control 
antibodies were used to normalize the background signals for intracellular 
staining. All events were acquired on a FACS Canto II cytometer 
(BD Biosciences) and analysed with FlowJo (Tree Star). Blue lines in 
Extended Data Fig. 2c indicate basal levels, orange lines indicate 
LPS stimulation for 6 h and red lines indicate an isotype control. The 
numbers mark the percentage of cells displaying phosphorylation of 
STAT3 or p38 based on comparison with isotype control staining for 
each cell type. 


Intracellular cytokine staining 

Intracellular cytokine staining for IL-6, TNF and IL-8 was performed in 
PBMCs at baseline and following LPS stimulation. Cells were washed twice 
with PBS, then treated with LPS (1 μg ml⁻¹ per 1 × 10^6 PBMCs) and Golgi 
plug (BD Biosciences) for 6 h at 37 °C with 5% CO2 and then permeabilized 
with Perm/Fix for 30 min at 4 °C. Cells were stained with the antibodies CD3-Percp-cy5.5 
(BD Biosciences), CD14-PE-CY7 (BD Biosciences), CD4-FITC 
(BD Biosciences), CD19-APC (BD Biosciences), IL8-PE (Biolegend), IL10-BV421 
(Biolegend), IL6-Percp-cy5.5 (Biolegend) and TNF-V450 (Biolegend). 
Isotype control antibodies were used to normalize the background 
signal for intracellular staining. All events were acquired on a FACS Canto 
II cytometer and analysed with FlowJo (Tree Star). 


Cell viability assay 

General cell survival was measured by the ATP luminescence assay 
CellTiter-Glo (Promega). The percentage of viability was normalized 
to readouts of untreated cells. 
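
The normalization described above can be sketched as follows. This is an illustrative calculation only: the function name and the luminescence readings are invented, and the sketch assumes that each treated well is divided by the mean of the untreated wells from the same plate, which the text does not spell out.

```python
from statistics import mean


def percent_viability(treated, untreated):
    """Express treated-well luminescence as a percentage of untreated wells."""
    baseline = mean(untreated)
    if baseline <= 0:
        raise ValueError("untreated readout must be positive")
    return [100.0 * reading / baseline for reading in treated]


# Made-up luminescence readings (arbitrary units) for triplicate wells.
untreated_wells = [10000.0, 10000.0, 10000.0]
treated_wells = [5000.0, 4000.0, 6000.0]

values = percent_viability(treated_wells, untreated_wells)
print(values)  # [50.0, 40.0, 60.0]
```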


Cell death assay 

Cell death was determined by ToxiLight Non-destructive Cytotoxicity 
BioAssay Kit (Lonza, LT07) or SYTOX Green Nucleic Acid Stain (Thermo 
Fisher, S7020). All experiments were conducted on 384-well plates 
with at least three biological replicates. Data were collected by the 
multimode plate reader (Bio Tek). 


Intracellular ROS detection 

Cells were seeded in 12-well plates and treated with the indicated stimuli 
for the indicated amount of time. After cell death induction, 5 μM 
carboxy-H2DCFDA was added to cells for 30 min at room temperature. 
Cells were then returned to warm growth medium and incubated for 
15 min, followed by replacement of growth medium with PBS. Images 
were taken using a Leica fluorescence microscope. 


Intracellular GSH detection 

The GSH concentration in cells was assessed by GSH-Glo Glutathione 
Assay Kit (Promega, V6911) according to the manufacturer’s instruc- 
tions. 


Statistics 

No statistical methods were used to predetermine sample size. For 
cell-based experiments, biological triplicates were performed in each 
single experiment in general, unless otherwise stated. All values were 
expressed as mean ± s.e.m. and calculated from the average of at least 
three independent biological replicates unless specifically stated. 
Statistical analysis was performed using GraphPad Prism 8 software 
(GraphPad Software). For comparisons between two groups, Student’s 
t-test (unpaired and two-tailed) was applied. In all tests, a 95% 
confidence interval was used, for which P < 0.05 was considered a 
significant difference. Statistical analysis of single-cell RNA sequencing 
and RNA sequencing was performed using R Software (R v.3.5.2). 
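
For readers who want to see the quantities reported in the figure legends, the sketch below computes the mean, the s.e.m. and the t statistic of an unpaired Student’s t-test with pooled variance. It is not the GraphPad Prism workflow used in the study; the function names and data are invented, and the two-tailed P value is left out because it needs the t distribution (available, for example, through `scipy.stats.ttest_ind`).

```python
from math import sqrt
from statistics import mean, stdev, variance


def sem(values):
    """Standard error of the mean: sample s.d. divided by sqrt(n)."""
    return stdev(values) / sqrt(len(values))


def student_t(a, b):
    """t statistic and degrees of freedom for an unpaired, equal-variance test."""
    na, nb = len(a), len(b)
    dof = na + nb - 2
    # Pooled variance weights each group's sample variance by its degrees of freedom.
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / dof
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, dof


# Made-up values for n = 3 independent experiments per group.
control = [1.0, 2.0, 3.0]
treated = [4.0, 5.0, 6.0]

t_stat, dof = student_t(treated, control)
print(f"control: {mean(control):.2f} ± {sem(control):.2f} (s.e.m.)")
print(f"t = {t_stat:.3f}, d.f. = {dof}")
```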


URLs 

ANNOVAR, http://annovar.openbioinformatics.org/en/latest/user-guide/download/; 
CADD, https://cadd.gs.washington.edu/; 
gnomAD, https://gnomad.broadinstitute.org/; 
Kaviar genomic variant database (Kaviar), http://db.systemsbiology.net/kaviar/; 
Sorting Intolerant from Tolerant (SIFT), https://sift.bii.a-star.edu.sg/; 
PolyPhen-2, http://genetics.bwh.harvard.edu/pph2/; 
likelihood ratio test (LRT), http://www.genetics.wustl.edu/jflab/lrt_query.html; 
MutationTaster, http://www.mutationtaster.org/. 


Reporting summary 
Further information on research design is available in the Nature 
Research Reporting Summary linked to this paper. 
Extended Data Fig. 1 | Identification of RIPK1 variants and their effects. 
a, Schematic of the WES data-filtering approach under the assumption of 
de novo inheritance in family 1, leading to the identification of a novel RIPK1 
variant. INDEL, frameshift or non-frameshift insertions and deletions; SNP, 
single nucleotide polymorphisms including missense, splice-site and stop- 
codon variants. b, Exome sequencing reads covering the D324V variant in 
family 1, displayed by the integrative genomics viewer. c, Confirmation of 
RIPK1 variants at residue Asp324 for patients P1-P5 by Sanger sequencing. 

d, Evolutionary conservation of the caspase-8 cleavage site D324 in RIPK1. 
Amino acid sequence of RIPK1 flanking D324 was aligned by ClustalW across 
various species. e, Schematic domain structure of RIPK1. The position of 
identified variants leading to defective caspase-8 cleavage is indicated. f, In 
silico analysis of novel RIPK1 variants at D324. cDNA positions are determined 
according to the reference sequence NM_003804. Four predictions including 
SIFT, PolyPhen-2, LRT and Mutation Taster annotated by ANNOVAR were 
included in the analysis. D indicates damaging or deleterious variant. The 
gnomAD database includes 123,136 exomes and 15,496 genomes. The Kaviar 
database includes 77,238 exomes and genomes. g, RIPK1 cleavage site variants 
caused defective cleavage in vitro. HEK293T cells were transiently transfected 
with wild-type or mutant RIPK1 plasmids followed by immunoblotting of cell 
lysates. EV, empty vector; H1, H2, different cloning plasmids of RIPK1(D324H) 
variant; V1, V2, different cloning plasmids of RIPK1(D324V) variant; WT, wild- 
type RIPK1 plasmid. For gel source data, see Supplementary Fig. 1. Results are 
representative of three independent experiments. h, The degradation of wild- 
type and D325A mutant RIPK1 protein was analysed by CHX chase assay. Top, 


Ripk1D325A/D325A Ripk3−/− and Ripk1+/+ Ripk3−/− MEFs were incubated with 50 μg ml−1
CHX for the indicated period of time and collected for western blot. Results are
representative of three independent experiments. The western blot was
quantified by ImageJ. Bottom, the relative RIPK1 protein level was normalized
to CHX-untreated cells. Data are mean ± s.d., n = 3. Circles correspond to each
independent experiment. For gel source data, see Supplementary Fig. 1.
i, Neither mutant disrupted the recruitment of RIPK1 and caspase-8 into the
FADDosome. Ripk1-knockout MEFs complemented with wild-type RIPK1, or
D325V or D325H mutant were treated as indicated for 1 or 3 h. T/Z, 50 ng ml−1
TNF, 50 μM Z-VAD-FMK. ‘+’ denotes 20 μM Nec-1s. Lysates were
immunoprecipitated with anti-FADD, and analysed by immunoblotting using
the indicated antibodies. For gel source data, see Supplementary Fig. 1. Results
are representative of three independent experiments. j, Unlabelled in vitro
transcription and translation of wild-type and mutant RIPK1 constructs
(D324V, D324H and D324K) were performed in the TNT T7 Quick Coupled
Transcription/Translation System followed by incubation with purified
recombinant caspase-8 protein for 3 h and then analysed by immunoblotting of
RIPK1. CL, cleaved RIPK1; FL, full-length RIPK1. For gel source data, see
Supplementary Fig. 1. Results are representative of three independent
experiments. k, The D324V variant disrupted the RIPK1 cleavage by caspase-8.
Fibroblasts from patient P1 and an unaffected control were treated with TNF
and CHX for the indicated amount of time followed by immunoblotting
analysis. For gel source data, see Supplementary Fig. 1. Results are
representative of three independent experiments.
Extended Data Fig. 2 | Increased pro-inflammatory signalling in patient
PBMCs. a, Intracellular cytokine staining of patient PBMCs showed expression
of IL-6, TNF and IL-8 in CD14+ monocytes at baseline (unstimulated, UNS) (top)
and after LPS stimulation for 6 h (bottom) compared to one representative
(left) and four paediatric unaffected controls (right). Data are mean ± s.e.m.
Circles correspond to each tested individual. b, Intracellular cytokine staining
of patient PBMCs showed increased expression of IL-6 in patient CD3+CD4+
T cells after LPS stimulation (1 μg ml−1) for 6 h compared to 1 representative
(left) and 4 paediatric unaffected controls (right). Data are mean ± s.e.m.
Circles correspond to each tested individual. c, Basal phosphorylation of
STAT3 (top) and phosphorylation of p38 after LPS stimulation (bottom) of
patient monocytes (CD14+), B cells (CD19+), and T cells (CD3+) compared to one
representative (left) and four paediatric unaffected controls (right) as
determined by flow cytometry analysis. Data are mean ± s.e.m. Circles
correspond to each tested individual. d, Single-cell RNA sequencing revealed a
higher percentage of monocytes in P1 compared with an age- and sex-matched
unaffected control (C1). e, f, Single-cell RNA sequencing revealed that the
NF-κB (e) and type I IFN (f) signalling pathways were upregulated in patient CD14+
and CD16+ monocytes compared with an age- and sex-matched unaffected
control (C1) and an adult control (AC). The adult control data were downloaded
from 10X Genomics. Analysis of patient sample was performed in duplicate. For
gene names, see Supplementary Fig. 2. g, Violin plots showing the distribution
of gene expression of selected genes in different cell clusters for P1 and an age-
and sex-matched unaffected control. n = 1,007 cells for patient and n = 4,340
cells for control CD4 T cell cluster. n = 1,868 cells for patient and n = 1,427 cells
for control B cell cluster. n = 422 cells for patient and n = 1,478 cells for control
CD8 T cell cluster. n = 302 cells for patient and n = 1,101 cells for control NK cell
cluster. n = 1,125 cells for patient and n = 241 cells for control CD14 monocytes
cluster. n = 1,184 cells for patient and n = 78 cells for control CD16 monocytes
cluster. n = 371 cells for patient and n = 563 cells for control T memory cell
cluster. n = 333 cells for patient and n = 524 cells for control γδ T cell cluster.
n = 249 cells for patient and n = 597 cells for control NK cell/T doublets cluster.
n = 182 cells for patient and n = 228 cells for control B activated cell cluster.
n = 357 cells for patient and n = 20 cells for control dendritic cell cluster. n = 117
cells for patient and n = 217 cells for control naive B cell cluster. n = 279 cells for
patient and n = 35 cells for control erythrocytes cluster. n = 29 cells for patient
and n = 86 cells for control megakaryocyte cell cluster. n = 78 cells for patient
and n = 23 cells for control plasma cell cluster. n = 33 cells for patient and n = 34
cells for control plasmacytoid dendritic cell cluster. The PBMCs from P1 for
a-g were obtained during a fever episode.
Extended Data Fig. 3 | Increased expression of genes in NF-κB and type-I IFN
pathways in patient PBMCs. a, Visualization of expression of genes involved in
NF-κB pathway (SOCS3, CD40, TLR2, IL1RN, IL6, CD38, TAP1, IL15, ICOS, CD83,
TNF, IFNG, CXCL2, FCGRT, TNFSF10, ICAM1, IGHG1, TNFRSF1B, CCL3, IGHG4 and
TREM1) (coloured single cells) on UMAP plot projecting PBMCs from patient P1
(n = 7,936 cells) and an age- and sex-matched unaffected control (n = 10,992
cells). b, Visualization of expression of genes involved in type-I IFN signalling
pathway (IFI6, IFI30, OAS1, GBP1, IFI27, ISG15, OAS2, SOCS1, IFI44, IRF7, OAS3,
LY6E, IFI44L, MX1, OASL, RSAD2, IFIT3, HERC5, USP18, IFIT2, HERC6 and EPSTI1)
(coloured single cells) on UMAP plot projecting PBMCs from patient P1
(n = 7,936 cells) and an age- and sex-matched unaffected control (n = 10,992
cells). The PBMCs from P1 in a and b were obtained during a fever episode.
Extended Data Fig. 4 | Patient PBMCs are sensitive to TNF-induced cell death
but not to ferroptosis. a, qPCR of PBMCs confirmed comparative expression
levels of cytokine and chemokine-related genes in P1, after 4 months of
tocilizumab treatment, compared to 8 paediatric unaffected controls. Data are
mean ± s.e.m. Circles correspond to each tested individual. Analysis of each
sample was performed in triplicate. b, Patient PBMCs were hypersensitive to
TNF-induced cell death. PBMCs from 8 age-matched unaffected controls and
patients P1 and P3 were treated as indicated for 24 h. N, 20 μM Nec-1s; S, 100 nM
SM-164; T, 100 ng ml−1 TNF; Z, 100 μM Z-VAD-FMK. Cell death was measured by
ToxiLight assay. Data are mean ± s.e.m. Circles correspond to each tested
individual. Analysis of each sample was performed in triplicate. c, Induction of
necroptosis and apoptosis by TNF in the patient PBMCs. PBMCs from patient P5
and a paediatric unaffected control were treated with indicated stimulation for
24 h before cell lysates were analysed by immunoblotting. For gel source data,
see Supplementary Fig. 1. Results are representative of two independent
experiments. d, Patient PBMCs stimulated with TNF alone exhibited
upregulated gene expression of inflammatory signals, which was reduced by
Nec-1s. PBMCs of patient and 2 unaffected controls were treated with 100 ng
ml−1 TNF or TNF plus 20 μM Nec-1s for 24 h before being analysed by
NanoString. e, The transcription levels of RIPK1 in PBMCs from patient and
unaffected controls measured by qPCR. Data are mean ± s.e.m. Circles
correspond to each tested individual. Analysis of each sample was performed
in triplicate. f, PBMCs of patient and two paediatric unaffected controls
showed similar responses to RSL3-induced ferroptosis. Circles correspond to
each tested individual. Analysis of each sample was performed in triplicate.
g, RNA sequencing of patient PBMCs indicated no difference in expression
patterns of genes involved in ferroptosis and antioxidant when compared to
three paediatric unaffected controls. Analysis of each sample was performed in
duplicate. h, Single-cell RNA sequencing did not reveal distinct expression
patterns of genes involved in ferroptosis and antioxidant in patient CD14+ and
CD16+ monocytes compared with an age- and sex-matched unaffected control
(C1) and an adult control (AC). The adult control data were downloaded from
10X Genomics. i, GSH concentration in PBMCs from P1 was similar to three
paediatric unaffected controls. Data are mean ± s.e.m. Circles correspond to
each tested individual. Analysis of each sample was performed in triplicate.
The PBMCs for a, b, d, f and i were obtained during remission. The PBMCs for c,
e, g and h were obtained during a fever episode.


Extended Data Fig. 5 | RIPK1 cleavage site variants in MEFs promote cell
death and inflammatory response but have no protective effect against
ferroptosis. a, b, RIPK1 cleavage site variants in MEFs promote cell death
induced by TNF. Ripk1−/− MEFs were complemented with: GFP; wild-type RIPK1;
or D325V or D325H mutant, and treated for 12 h (a) or as indicated (b). N, 20 μM
Nec-1s; S, 20 nM SM-164; T, 50 ng ml−1 TNF; Z, 50 μM Z-VAD-FMK. Cell viability
and cell death were measured by CellTiter-Glo assay (a) and ToxiLight assay (b),
respectively. Data are mean ± s.e.m., n = 3. Circles correspond to each
independent experiment. P values were determined by unpaired, two-tailed
t-test (shown if P < 0.05). c, Western blots illustrating increased levels of
p-S166-RIPK1, p-S345-MLKL and cleaved caspase-3 after stimulation with TNF and
SM-164, which were inhibited by Nec-1s. Ripk1−/− MEFs complemented with: GFP;
wild-type RIPK1; or D325V or D325H mutant, were treated as indicated. Cell
lysates were analysed by immunoblotting using indicated antibodies. For gel
source data, see Supplementary Fig. 1. Results are representative of three
independent experiments. d, e, D325A knock-in Ripk1 mutation sensitizes
Ripk3−/− MEFs to TNF-induced RIPK1-dependent apoptosis. Ripk1D325A/D325A
Ripk3−/− and Ripk1+/+ Ripk3−/− MEFs were stimulated with TNF only, TNF plus
Nec-1s (d) or a combination of TNF, SM-164 and Nec-1s (e) as indicated
(concentrations as in a). Cell death was measured by the SYTOX Green Nucleic
Acid Stain assay. Data are mean ± s.e.m., n = 4. Circles correspond to each
independent experiment. P values determined by unpaired two-tailed t-test,
and indicate the comparison between Ripk1D325A/D325A Ripk3−/− and
Ripk1+/+ Ripk3−/− MEFs after TNF or TNF plus SM-164 stimulation for indicated
amount of time. f, RIPK1 cleavage site variants in MEFs sensitize TRAIL-induced
cell death. Ripk1−/− MEFs complemented with: GFP, wild-type RIPK1, or D325V or
D325H mutant were treated with TRAIL (100 ng ml−1) or TRAIL plus Nec-1s
(20 μM) for 36 h. Data are mean ± s.e.m., n = 3. Circles correspond to each
independent experiment. P values determined by unpaired two-tailed t-test.
g, TRAIL stimulation of Ripk1−/− MEFs complemented with D325V or D325H
mutant promotes RIPK1 activation, which was inhibited by Nec-1s (20 μM).
TRAIL (100 ng ml−1); TNF (50 ng ml−1) for 12 h. For gel source data, see
Supplementary Fig. 1. Results are representative of three independent
experiments. h, Ripk1-knockout MEFs complemented with wild-type RIPK1, or
D325V or D325H mutant plasmid showed similar responses to erastin- or
RSL3-induced ferroptosis. Data are mean ± s.e.m., n = 3. Circles correspond to each
independent experiment. i, Western blots of proteins involved in ferroptosis in
Ripk1−/− MEFs complemented with wild-type RIPK1, or D325V or D325H mutant.
Cells were treated with erastin for 5 or 10 h, followed by immunoblotting of cell
lysates. For gel source data, see Supplementary Fig. 1. Results are
representative of three independent experiments. j, RNA sequencing of
Ripk1−/− MEFs complemented with wild-type RIPK1, D325V or D325H mutant
indicated no difference in expression patterns of genes involved in ferroptosis
and antioxidant. Analysis of each sample was performed in duplicate. k, GSH
concentration of Ripk1−/− MEFs complemented with wild-type RIPK1, D325V or
D325H mutant was similar both at baseline and after erastin or glutamate
stimulation for 8 h. Data are mean ± s.e.m., n = 3. Circles correspond to each
independent experiment. l, Immunofluorescence showed similar levels of
cytosolic ROS after erastin stimulation in Ripk1−/− MEFs complemented with
wild-type RIPK1, D325V or D325H mutant. Cells were treated with erastin for 8 h
before incubation with the cytosolic ROS sensor carboxy-H2DCFDA. Green foci
indicate cytosolic ROS. Scale bar, 150 μm. Results are representative of two
independent experiments.
Extended Data Fig. 6 | Patient fibroblasts were resistant to both necroptosis
and ferroptosis. a, Patient fibroblasts were resistant to necroptosis induced
by SM-164, Z-VAD-FMK and TNF or LPS. Fibroblasts from P1 and seven
paediatric unaffected controls were treated as indicated for 24 h. LPS, 1 μg ml−1;
N, 10 μM Nec-1s; S, 50 nM SM-164; T, 50 ng ml−1 TNF; Z, 50 μM Z-VAD-FMK. Cell
death was measured by ToxiLight assay. Data are mean ± s.e.m. Circles
correspond to each tested individual. Analysis of each sample was performed
in triplicate. b, Patient fibroblasts showed reduced necroptosis signals after
SM-164, Z-VAD-FMK and LPS stimulation compared to six paediatric unaffected
controls. Patient and control fibroblasts were treated with indicated
stimulation for 6 h (concentrations as in a). Cells were lysed and analysed by
immunoblotting with indicated antibodies. For gel source data, see
Supplementary Fig. 1. Results are representative of three independent
experiments. c, Patient fibroblasts showed reduced necroptosis signals after
SM-164, Z-VAD-FMK and TNF stimulation compared with a paediatric
unaffected control. Patient and control fibroblasts were treated as indicated
for 6 h (concentrations as in a). Cells were lysed and analysed by
immunoblotting with indicated antibodies. For gel source data, see
Supplementary Fig. 1. Results are representative of three independent
experiments. d, The reduction of RIPK1 was rescued by Nec-1s in patient
fibroblasts. Fibroblasts were treated as indicated for 24 h (concentrations as in
a). NSA, 0.5 μM necrosulfonamide. Cell lysates were analysed by
immunoblotting using indicated antibodies. For gel source data, see
Supplementary Fig. 1. Results are representative of three independent
experiments. e, Patient fibroblasts showed reduced transcription levels of
RIPK1 compared to five paediatric unaffected controls. The mRNA levels of
RIPK1 were measured by qPCR. Data are mean ± s.e.m. Circles correspond to
each tested individual. Analysis of each sample was performed in triplicate.
f, Patient fibroblasts exhibited reduced TNFR1 expression at baseline
compared to five paediatric unaffected controls. For gel source data, see
Supplementary Fig. 1. Results are representative of three independent
experiments. g, Patient fibroblasts displayed downregulation of genes
involved in cell death compared with three paediatric unaffected controls.
Analysis of each sample was performed in duplicate. h, Patient fibroblasts were
resistant to erastin- or RSL3-induced ferroptosis compared with three
paediatric unaffected controls. Cell death was measured by ToxiLight assay.
Data are mean ± s.e.m. Circles correspond to each tested individual. Analysis of
each sample was performed in triplicate.


Extended Data Table 1 | Clinical manifestations of patients with RIPK1 variants

Patient            P1        P2        P3        P4        P5
Variant            p.D324V   p.D324H   p.D324H   p.D324H   p.D324H
Age                2 y       35 y      14 y      12 y      10 y
Gender             M         F         M         M         M
Age at onset       2 mo      6 mo      1 mo      -         1 mo
Recurrent fevers   +         +         +         -         +
Fever frequency    8-10 d    10-15 d   15 d      -         2-3 d
Fever duration     3-5 d     3 h-2 d   3 h-2 d   -         3 h-1 d


Lymphadenopathy +
Splenomegaly +
Hepatomegaly +
Microcytic anemia
Abdominal pain




Extended Data Table 2 | The count and percentage of T and B cells in patient 1

                                    Controls           Patient 1
Gender                                                 M
Age at evaluation                   1-4 y (n = 289)    2 y
Counts (cells μl−1)
CD4 Helper T, naive                 472-1760           3887
CD4 Helper T, central memory        212-735            1570
CD4 Helper T, effector memory       15-87              145
CD4 TEMRA                           0-22               16
CD8 Cytotoxic T, naive              356-1095           1912
CD8 Cytotoxic T, central memory     56-406             825
CD8 Cytotoxic T, effector memory    6-145              89
CD8 Cytotoxic T, TEMRA              9-440              434
DNT TCR αβ+                         9-57               103
CD19 Naive B                        323-1108           9216
CD19 Memory B                       26-124             103
CD19 Transitional B                 35-172             1504
CD19 Plasmablasts                   4-63               61
Percentages (%)
CD4 Helper T, naive                 46.14-84.40        69.2
CD4 Helper T, central memory        13.88-48.12        27.9
CD4 Helper T, effector memory       0.94-6.46          2.6
CD4 TEMRA                           0.00-1.36          0.3
CD8 Cytotoxic T, naive              36.80-83.16        58.7
CD8 Cytotoxic T, central memory     5.18-31.66         25.3
CD8 Cytotoxic T, effector memory    0.70-11.22         2.8
CD8 Cytotoxic T, TEMRA              0.84-33.02         13.3
DNT TCR αβ+                         0.37-1.80          1.04
CD19 Naive B                        65.54-86.62        95.4
CD19 Memory B                       2.98-14.18         1.1
CD19 Transitional B                 5.24-17.22         15.6
CD19 Plasmablasts                   0.50-7.06          0.6

The reference ranges for the relative and absolute numbers of lymphocyte subpopulations were determined from samples
from 289 age-matched unaffected controls. The whole blood sample from patient P1 was
obtained during a fever episode.
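
As a rough illustration of how the absolute counts in Extended Data Table 2 compare with the control ranges, the sketch below flags patient values that fall outside the reference interval. The values are copied from the counts section of the table; the dictionary layout and the helper name `flag_out_of_range` are assumptions made for illustration and are not part of the original analysis.

```python
# Control ranges (cells per microlitre) and patient P1 counts, copied from the
# "Counts" section of Extended Data Table 2. The data layout and helper name
# are illustrative assumptions, not the authors' code.
COUNTS = {
    "CD4 Helper T, naive": ((472, 1760), 3887),
    "CD4 Helper T, central memory": ((212, 735), 1570),
    "CD4 Helper T, effector memory": ((15, 87), 145),
    "CD4 TEMRA": ((0, 22), 16),
    "CD8 Cytotoxic T, naive": ((356, 1095), 1912),
    "CD8 Cytotoxic T, central memory": ((56, 406), 825),
    "CD8 Cytotoxic T, effector memory": ((6, 145), 89),
    "CD8 Cytotoxic T, TEMRA": ((9, 440), 434),
    "DNT TCR alpha-beta+": ((9, 57), 103),
    "CD19 Naive B": ((323, 1108), 9216),
    "CD19 Memory B": ((26, 124), 103),
    "CD19 Transitional B": ((35, 172), 1504),
    "CD19 Plasmablasts": ((4, 63), 61),
}


def flag_out_of_range(table):
    """Return {subset: "high" | "low"} for values outside the control range."""
    flags = {}
    for name, ((low, high), value) in table.items():
        if value > high:
            flags[name] = "high"
        elif value < low:
            flags[name] = "low"
    return flags


if __name__ == "__main__":
    for name, status in flag_out_of_range(COUNTS).items():
        print(f"{name}: {status}")
```

With the counts listed in the table, this flags eight subsets as above the control range and none below it; the same helper could be applied to the percentage rows by swapping in those values.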


Extended Data Table 3 | Cytokine levels in serum of patients from family 2

No.  Cytokine  P2     P3    P4    P5    C1     C2     C3     C4
1    IFN-γ     16.99  9.67  39.3  4.93  1.66   <0.97  <0.97  1.66
2    IL-6      1.62   1.79  9.83  1.24  <1.03  <1.03  <1.03  <1.03
3    IL-8      1.91   1.69  4.52  6.01  1.28   4.66   <1.07  1.11
4    P40       89.31  1995  5218  303   2037   1659   1821   12.87

C1 is the mother of P2; C2 is the fourth son of P2; C3 and C4 are adult unaffected controls.
The serum samples were obtained when all of the patients were in remission. Cytokine
concentrations are in pg ml−1.


Corresponding author(s): Xiaomin Yu, Xiaochuan Wang, Junying Yuan, Qing Zhou


Last updated by author(s): Oct 10, 2019 


Reporting Summary 


Nature Research wishes to improve the reproducibility of the work that we publish. This form provides structure for consistency and transparency 
in reporting. For further information on Nature Research policies, see Authors & Referees and the Editorial Policy Checklist. 


Statistics 


For all statistical analyses, confirm that the following items are present in the figure legend, table legend, main text, or Methods section. 


n/a | Confirmed 


The exact sample size (n) for each experimental group/condition, given as a discrete number and unit of measurement 


A statement on whether measurements were taken from distinct samples or whether the same sample was measured repeatedly 


The statistical test(s) used AND whether they are one- or two-sided 
Only common tests should be described solely by name; describe more complex techniques in the Methods section. 


A description of all covariates tested 


A description of any assumptions or corrections, such as tests of normality and adjustment for multiple comparisons 


A full description of the statistical parameters including central tendency (e.g. means) or other basic estimates (e.g. regression coefficient)
AND variation (e.g. standard deviation) or associated estimates of uncertainty (e.g. confidence intervals) 


For null hypothesis testing, the test statistic (e.g. F, t, r) with confidence intervals, effect sizes, degrees of freedom and P value noted
Give P values as exact values whenever suitable. 


For Bayesian analysis, information on the choice of priors and Markov chain Monte Carlo settings 


For hierarchical and complex designs, identification of the appropriate level for tests and full reporting of outcomes 


Estimates of effect sizes (e.g. Cohen's d, Pearson's r), indicating how they were calculated 


Our web collection on statistics for biologists contains articles on many of the points above. 


Software and code 


Policy information about availability of computer code 


Data collection Flow cytometry, intracellular cytokine staining and BD™ Cytometric Bead Array data were acquired on a FACS Canto II cytometer (BD
Biosciences). qPCR data were acquired on Applied Biosystems 7500 Real-Time PCR System (Life Technologies) and ROCHE 480II.
Immunoblot images were scanned by FluorChem E (ProteinSimple) and scanner (Epson Perfection V700 Photo). Chemiluminescence data
were collected by the multimode plate reader (BioTek). Green fluorescent images were taken by a Leica fluorescence microscope (LEICA
DMI 6000B). NanoString assay was conducted by the nCounter Analysis System (NanoString Technologies). RNA sequencing and whole
exome sequencing were carried out on an Illumina NovaSeq system. Single-cell RNA sequencing was carried out on a 10x Genomics
Chromium machine for single-cell capture and cDNA preparation and on an Illumina NovaSeq system for sequencing.


Data analysis FlowJo (Tree Star) (Flow cytometry and intracellular cytokine staining analysis); GraphPad Prism 8, IBM SPSS Statistics 25 and Microsoft 
Excel (Graphs, statistics); ImageJ (image analysis); nSolver 4.0 (NanoString assay); Cell Ranger V3.0.1 and Seurat R package (single cell
RNA sequencing analysis); R 3.5.2 (data analysis and graph); DESeq2 R package (RNA sequencing analysis); FCAP Array V3.0 (BD 
Biosciences) (cytokines data measured by BD™ Cytometric Bead Array); ANNOVAR (2018Apr16) (mutation effect predictions). 


For manuscripts utilizing custom algorithms or software that are central to the research but not yet described in published literature, software must be made available to editors/reviewers. 
We strongly encourage code deposition in a community repository (e.g. GitHub). See the Nature Research guidelines for submitting code & software for further information. 


Data 


Policy information about availability of data 


All manuscripts must include a data availability statement. This statement should provide the following information, where applicable: 
- Accession codes, unique identifiers, or web links for publicly available datasets 
- A list of figures that have associated raw data 
- A description of any restrictions on data availability


Source data for graphs are provided with the paper. Uncropped raw gel data are shown in Supplementary Fig. 1. Other source data that support the findings of this
study are available from the corresponding author upon reasonable request.


Data exclusions No data were excluded from analysis.


Replication In Fig. 2d, Fig. 4g, Extended Data Fig. 4g, Extended Data Fig. 5j and Extended Data Fig. 6g, RNA sequencing analysis of each sample was
performed in duplicate. In Fig. 2b, c, Extended Data Fig. 2d-g, Extended Data Fig. 3a, b and Extended Data Fig. 4h, single-cell RNA sequencing
analysis of the patient sample was performed in duplicate and that of the healthy control (C1) was performed once. In Fig. 3d and Extended Data Fig. 4d,
NanoString analysis of patient and healthy control PBMCs was performed once. In Fig. 4i, the immunofluorescence of cytosolic ROS (green
foci) in fibroblasts was performed in duplicate. The other data are representative of three independent biological replicates.


Randomization Multiple age- and gender-matched healthy controls used in the experiment were randomly selected. Patient samples were taken multiple
times and used for independent biological replicate experiments.


Blinding Blinding was not possible as the authors who performed the experiment also analyzed the data. 


Reporting for specific materials, systems and methods 


We require information from authors about some types of materials, experimental systems and methods used in many studies. Here, indicate whether each material, 
system or method listed is relevant to your study. If you are not sure if a list item applies to your research, read the appropriate section before selecting a response. 


Materials & experimental systems
n/a | Involved in the study
Antibodies
Eukaryotic cell lines
Palaeontology
Animals and other organisms
Human research participants
Clinical data


Methods
n/a | Involved in the study
ChIP-seq
Flow cytometry
MRI-based neuroimaging


Antibodies 


Antibodies used Western Blot:
β-Actin (Cell Signaling Technology, 4970, clone 13E5, lot 15, 1:1000), β-Tubulin (Cell Signaling Technology, 86298, clone D3U1W,
lot 1, 1:1000), GAPDH (Cell Signaling Technology, 5174, clone D16H11, lot 6, 1:1000), RIPK1 (Cell Signaling Technology, 3493,
clone D94C12, lot 3, 1:1000), p-RIPK1 (Ser166) (Cell Signaling Technology, 65746, clone D1L3S, lot 02, 1:1000), MLKL (Cell
Signaling Technology, 14993, clone D2I6N, lot 3, 1:1000), p-MLKL (Ser358) (Cell Signaling Technology, 91689, clone D6H3V, lot 3,
1:1000), p-MLKL (Ser345) (Cell Signaling Technology, 37333, clone D6E3G, lot 2, 1:1000), p65 (Cell Signaling Technology, 8242,
clone D14E12, lot 9, 1:1000), p-p65 (Ser536) (Cell Signaling Technology, 3033, clone 93H1, lot 16, 1:1000), IKKα (Cell Signaling
Technology, 11930, clone 3G12, lot 5, 1:1000), IKKβ (Cell Signaling Technology, 2370, clone D30C6, lot 4, 1:1000), p-IKKα/β
(Ser176/180) (Cell Signaling Technology, 2697, clone 16A6, lot 19, 1:1000), IκBα (Cell Signaling Technology, 4814, clone L35A5,
lot 17, 1:1000), p-IκBα (Ser32) (Cell Signaling Technology, 2859, clone 14D4, lot 18, 1:1000), p38 (Cell Signaling Technology,
8690, clone D13E1, lot 6, 1:1000), p-p38 (Thr180/Tyr182) (Cell Signaling Technology, 4511, clone D3F9, lot 13, 1:1000), TNFR1
(Cell Signaling Technology, 3736, clone C25C1, lot 2, 1:1000), caspase-8 (Cell Signaling Technology, 4790, clone D35G2, lot 2,
1:1000), cl-caspase-8 (Cell Signaling Technology, 8592, clone D5B2, lot 2, 1:1000), caspase-3 (Cell Signaling Technology, 9662,
polyclonal, lot 18, 1:1000), cleaved-caspase-3 (Asp175) (Cell Signaling Technology, 9661, polyclonal, lot 18, 1:1000), HA-Tag (Cell
Signaling Technology, 3724, clone C29F1, lot 8, 1:1000), SLC7A11 (Cell Signaling Technology, 12691, clone D2M7A, lot 1, 1:1000),
Cyclophilin A (Abcam, ab41684, polyclonal, lot GR3201186-1, 1:1000), GPX4 (Abcam, ab125066, clone EPNCIR144, lot
GR251529-29, 1:1000), Lamp-2A (Abcam, ab125068, clone EPR4207(2), lot GR7472-1, 1:1000), COX2 (Abcam, ab15191,
polyclonal, 1:1000), p53 (Abcam, ab32389, clone E26, 1:1000), FADD (Santa Cruz Biotechnology, sc-6036, clone M-19, 1:1000),
ACSL4 (Santa Cruz Biotechnology, sc-365230, clone F-4, lot G1116, 1:1000), HSC70 (Proteintech Group, 10654-1-AP, polyclonal,
1:1000), HSP90 (Affinity, BF9107, clone AFB5588, lot 55e0886, 1:1000), p-RIPK1 (Ser166) (Biolynx, BX60008, clone YJY-1-5,
1:1000), MLKL (Mus) (home made, 1:1000).


Flow cytometry and intracellular cytokine staining:
CD3-APC-H7 (BD Biosciences, 560176, clone SK7, lot 9022755, 1:20), CD14-PE-CY7 (BD Biosciences, 557742, clone M5E2, lot
3291733, 1:20), CD4-FITC (BD Biosciences, 555346, clone RPA-T4, lot 8037703, 1:20), CD19-BB700 (BD Biosciences, 566396,
clone SJ25C, lot 8130761, 1:20), IL8-PE (Biolegend, 511408, clone E8N1, lot B230067, 1:20), IL10-BV421 (Biolegend, 501421,
clone JES3-9D7, lot B256252, 1:20), IL6-Percp-cy5.5 (Biolegend, 501118, clone MQ2-13A5, lot B267970, 1:20), TNF-V450 (BD
Biosciences, 561311, clone Mab11, lot 7209966, 1:20).


Alexa Fluor® 647-conjugated antibody against STAT3 phosphorylated at Y705 (BD Biosciences, 557815, clone 4/P-STAT3, lot
7346960, 1:10), Alexa Fluor® 488-conjugated antibody against NF-κB p65 phosphorylated at S529 (BD Biosciences, 558421, clone
K10-895.12.50, lot 8241506, 1:10), Alexa Fluor® 488-conjugated antibody against p38 phosphorylated at T180/Y182 (BD
Biosciences, 612594, clone 36/p38 (pT180/pY182), lot 8032972, 1:10).


Isotype control antibodies:
Alexa Fluor® 647-conjugated mouse IgG1κ (BD Biosciences, 557714, clone MOPC-21, lot 7076782, 1:20), Alexa Fluor® 488-Mouse
IgG1κ (BD Biosciences, 557782, clone MOPC-21, lot 7102576, 1:20), BV421-Rat IgG1κ (Biolegend, 400439, clone
RTK2071, lot B272547, 1:20), PE-Mouse IgG1κ (Biolegend, 400140, clone MOPC-21, lot B272822, 1:20), Percp-cy5.5-Rat IgG1κ
(Biolegend, 400426, clone RTK2071, lot B255186, 1:20), V450-Mouse IgG1κ (BD Biosciences, 560373, clone MOPC-21, lot
6021915, 1:20).


Validation All antibodies except MLKL (Mus, home made) are commercially available and have been verified by the manufacturers
according to the immunoblots and/or images on their websites.

β-Actin: https://www.cst-c.com.cn/products/primary-antibodies/b-tubulin-d3u1w-mouse-mab/86298,
β-Tubulin: https://www.cst-c.com.cn/products/primary-antibodies/b-tubulin-d3u1w-mouse-mab/86298,
GAPDH: https://www.cst-c.com.cn/products/primary-antibodies/gapdh-d16h11-xp-rabbit-mab/5174,
RIPK1: https://www.cst-c.com.cn/products/primary-antibodies/rip-d94c12-xp-rabbit-mab/3493,
p-RIPK1 (Ser166): https://www.cst-c.com.cn/products/primary-antibodies/phospho-rip-ser166-d1l3s-rabbit-mab/65746,
MLKL: https://www.cst-c.com.cn/products/primary-antibodies/mlkl-d2i6n-rabbit-mab/14993,
p-MLKL (Ser358): https://www.cst-c.com.cn/products/primary-antibodies/phospho-mlkl-ser358-d6h3v-rabbit-mab/91689,
p65: https://www.cst-c.com.cn/products/primary-antibodies/nf-kb-p65-d14e12-xp-rabbit-mab/8242,
p-p65 (Ser536): https://www.cst-c.com.cn/products/primary-antibodies/phospho-nf-kb-p65-ser536-93h1-rabbit-mab/3033,
IKKα: https://www.cst-c.com.cn/products/primary-antibodies/ikka-3g12-mouse-mab/11930,
IKKβ: https://www.cst-c.com.cn/products/primary-antibodies/ikkb-d30c6-rabbit-mab/8943,
p-IKKα/β (Ser176/180): https://www.cst-c.com.cn/products/primary-antibodies/phospho-ikka-b-ser176-180-16a6-rabbit-mab/2697,
IκBα: https://www.cst-c.com.cn/products/primary-antibodies/ikba-l35a5-mouse-mab-amino-terminal-antigen/4814,
p-IκBα (Ser32): https://www.cst-c.com.cn/products/primary-antibodies/phospho-ikba-ser32-14d4-rabbit-mab/2859,
p38: https://www.cst-c.com.cn/products/primary-antibodies/p38-mapk-d13e1-xp-rabbit-mab/8690,
p-p38 (Thr180/Tyr182): https://www.cst-c.com.cn/products/primary-antibodies/phospho-p38-mapk-thr180-tyr182-d3f9-xp-rabbit-mab/4511,
TNFR1: https://www.cst-c.com.cn/products/primary-antibodies/tnf-r1-c25c1-rabbit-mab/3736,
HA-Tag: https://www.cst-c.com.cn/products/primary-antibodies/ha-tag-c29f4-rabbit-mab/3724,
p-MLKL (Ser345): https://www.cst-c.com.cn/products/antibody-conjugates/phospho-mlkl-ser345-d6e3g-rabbit-mab/37333,
caspase-8: https://www.cst-c.com.cn/products/primary-antibodies/caspase-8-1c12-mouse-mab/9746,
caspase-3: https://www.cst-c.com.cn/products/primary-antibodies/caspase-3-antibody/9662,
cleaved caspase-3 (Asp175): https://www.cst-c.com.cn/products/primary-antibodies/cleaved-caspase-3-asp175-antibody/9661,
SLC7A11: https://www.cst-c.com.cn/products/primary-antibodies/xct-slc7a11-d2m7a-rabbit-mab/12691,

HSP90: http://affbiotech.cn/goods-4354-BF9107-HSP90+beta+Antibody.html, 

Cyclophilin A: https://www.abcam.com/cyclophilin-a-antibody-ab41684.html, 

GPX4: https://www.abcam.com/glutathione-peroxidase-4-antibody-epncir144-ab125066.html, 

Lamp-2A: https://www.abcam.com/lamp2a-antibody-epr42072-lysosome-marker-ab125068.html, 

COX2: https://www.abcam.com/cox2--cyclooxygenase-2-antibody-ab15191.html, 

p53: https://www.abcam.com/p53-antibody-e26-ab32389.html, 

FADD: https://www.scbt.com/scbt/zh/product/fadd-antibody-m-19, 

ACSL4: https://www.scbt.com/scbt/product/acsl4-antibody-a-5, 

HSC70: https://www.ptglab.com/products/HSPA8-Antibody-10654-1-AP.html, 

p-RIPK1 (Ser166) (Biolynx) : http://www.biolynx.cn/product/seDetail/722, 

CD3-APC-H7: http://www.bdbiosciences.com/eu/applications/research/t-cell-immunology/th-1-cells/surface-markers/human/apc-h7-mouse-anti-human-cd3-sk7-also-known-as-leu-4/p/560176,

CD14-PE-CY7: http://www.bdbiosciences.com/eu/applications/research/stem-cell-research/hematopoietic-stem-cell-markers/human/negative-markers/pe-cy7-mouse-anti-human-cd14-m5e2/p/557742,

CD4-FITC: http://www.bdbiosciences.com/eu/applications/research/t-cell-immunology/th-1-cells/surface-markers/human/fitc-mouse-anti-human-cd4-rpa-t4/p/555346,

CD19-BB700: http://www.bdbiosciences.com/eu/reagents/research/antibodies-buffers/immunology-reagents/anti-human-antibodies/cell-surface-antigens/bb700-mouse-anti-human-cd19-sj25c1-also-known-as-sj25-c1/p/566396,

IL8-PE: https://www.biolegend.com/en-us/products/pe-anti-human-il-8-antibody-4131,

IL10-BV421: https://www.biolegend.com/en-us/products/brilliant-violet-421-anti-human-il-10-antibody-7156,
IL6-Percp-cy5.5: https://www.biolegend.com/en-us/products/percpcyanine55-anti-human-il-6-antibody-13095,

TNF-V450: http://www.bdbiosciences.com/eu/applications/research/t-cell-immunology/th-1-cells/intracellular-markers/cytokines-and-chemokines/human/v450-mouse-anti-human-tnf-mab11/p/561311,

Alexa Fluor® 647-conjugated antibody against STAT3 phosphorylated at Y705: http://www.bdbiosciences.com/eu/reagents/research/antibodies-buffers/cell-biology-reagents/cell-biology-antibodies/alexa-fluor-647-mouse-anti-stat3-py705-4p-stat3/p/557815,

Alexa Fluor® 488-conjugated antibody against NF-κB p65 phosphorylated at S529: http://www.bdbiosciences.com/eu/applications/research/intracellular-flow/intracellular-antibodies-and-isotype-controls/anti-human-antibodies/alexa-fluor-488-mouse-anti-nf-b-p65-ps529-k10-8951250/p/558421,

Alexa Fluor® 488-conjugated antibody against p38 phosphorylated at T180/Y182: http://www.bdbiosciences.com/eu/applications/research/intracellular-flow/intracellular-antibodies-and-isotype-controls/anti-rat-antibodies/alexa-fluor-488-mouse-anti-p38-mapk-pt180py182-36p38-pt180py182/p/612594,

Alexa Fluor® 647-conjugated mouse IgG1κ: http://www.bdbiosciences.com/eu/reagents/research/antibodies-buffers/immunology-reagents/anti-human-antibodies/cell-surface-antigens/alexa-fluor-647-mouse-igg1-isotype-control-mopc-21/p/557714,

Alexa Fluor® 488-Mouse IgG1κ: http://www.bdbiosciences.com/eu/reagents/research/antibodies-buffers/cell-biology-reagents/isotype-controls/alexa-fluor-488-mouse-igg1-isotype-control-mopc-21/p/557782,

PE-Mouse IgG1κ: https://www.biolegend.com/en-us/products/pe-mouse-igg1--kappa-isotype-ctrl-icfc-3032,

Percp-cy5.5-Rat IgG1κ: https://www.biolegend.com/en-us/products/percp-cy5-5-rat-igg1--kappa-isotype-ctrl-4203,

V450-Mouse IgG1κ: http://www.bdbiosciences.com/eu/reagents/research/antibodies-buffers/immunology-reagents/anti-human-antibodies/cell-surface-antigens/v450-mouse-igg1-isotype-control-mopc-21/p/560373.

Validation data are provided by each supplier. These commercial antibodies are widely used and have been
reported in many previous publications.

The in-house antibody against mouse MLKL was validated in Wu, Z. et al. Chaperone-mediated autophagy is
involved in the execution of ferroptosis. Proceedings of the National Academy of Sciences 116, 2996,
doi:10.1073/pnas.1819728116 (2019).


Eukaryotic cell lines

Policy information about cell lines


Cell line source(s): The HEK293T cell line was obtained from the American Type Culture Collection (ATCC).
Ripk1 knockout MEFs were established from Ripk1-/- mice. MEFs derived from D325A knock-in mice were kindly
contributed by Jianke Zhang.

Authentication: The cell line from ATCC was authenticated by ATCC. The primary lines were cultured for a
limited number of passages.

Mycoplasma contamination: Cell lines tested negative for Mycoplasma contamination.

Commonly misidentified lines (See ICLAC register): No commonly misidentified cell lines were used.


Human research participants 


Policy information about studies involving human research participants 


Population characteristics: Patient characteristics are described in Extended Data Table 1. Healthy controls
were less than 10 years old and had no symptoms of inflammation at the time of sampling.

Recruitment: Patient samples were obtained from patients with early-onset autoinflammatory disease but
without a clear genetic diagnosis. Multiple age- and gender-matched healthy controls used in the experiments
were randomly selected.

Ethics oversight: Patient P1 was evaluated under protocols approved by the Institutional Review Board (IRB)
of Children's Hospital of Fudan University (Shanghai, China). Patients P2, P3, P4 and P5 and their unaffected
family members were evaluated at McMaster Children's Hospital (Ontario, Canada) and the Hospital for Sick
Children (Toronto, Canada). Signed consent for their clinical information to be shared and for research
samples to be sent to Boston Children's Hospital (Boston, USA) was obtained. Ethics clearance was received
from the Institutional Review Board (IRB) at Boston Children's Hospital (Boston, USA) and from the Western
Institutional Review Board. All relevant ethical regulations were followed. All patients and/or substitute
decision makers provided written informed consent.


Note that full information on the approval of the study protocol must also be provided in the manuscript. 


Flow Cytometry

Plots

Confirm that:

The axis labels state the marker and fluorochrome used (e.g. CD4-FITC).
The axis scales are clearly visible. Include numbers along axes only for bottom left plot of group (a 'group' is an analysis of identical markers).
All plots are contour plots with outliers or pseudocolor plots.
A numerical value for number of cells or percentage (with statistics) is provided.


Methodology

Sample preparation: We used EDTA-anticoagulated peripheral whole blood from patients and healthy donors.
PBMCs from patients and healthy donors were separated using lymphocyte separation medium (LSM) and SepMate
tubes (Stemcell) according to the manufacturer's instructions.

Instrument: All events were acquired on a FACS Canto II cytometer.

Software: All events were analyzed with FlowJo (10.0).

Cell population abundance: 1.0 × 10⁶ PBMCs were used for each test. A Countess II FL (Thermo Fisher) and
0.4% Trypan Blue were used to determine the viability and number of cells. The viability of PBMCs was above
85% for each sample.

Gating strategy: Surface markers CD3, CD4, CD14 and CD19 were used to gate total T cells, CD4+ T cells,
monocytes and total B cells, respectively. A figure exemplifying the gating strategy is provided in the
Supplementary Information.

Tick this box to confirm that a figure exemplifying the gating strategy is provided in the Supplementary Information.


Metastasis requires cancer cells to undergo metabolic changes that are poorly
understood. Here we show that metabolic differences among melanoma cells
confer differences in metastatic potential as a result of differences in the function of
the MCT1 transporter. In vivo isotope tracing analysis in patient-derived xenografts
revealed differences in nutrient handling between efficiently and inefficiently
metastasizing melanomas, with circulating lactate being a more prominent source of
tumour lactate in efficient metastasizers. Efficient metastasizers had higher levels of
MCT1, and inhibition of MCT1 reduced lactate uptake. MCT1 inhibition had little effect
on the growth of primary subcutaneous tumours, but resulted in depletion of
circulating melanoma cells and reduced the metastatic disease burden in
patient-derived xenografts and in mouse melanomas. In addition, inhibition of MCT1
suppressed the oxidative pentose phosphate pathway and increased levels of reactive
oxygen species. Antioxidants blocked the effects of MCT1 inhibition on metastasis.
MCT1high and MCT1−/low cells from the same melanomas had similar capacities to form
subcutaneous tumours, but MCT1high cells formed more metastases after intravenous
injection. Metabolic differences among cancer cells thus confer differences in
metastatic potential as metastasizing cells depend on MCT1 to manage oxidative
stress.


Metastasis is a very inefficient process in which few disseminated
cancer cells survive. One factor that limits metastasis in some cancers,
including melanoma, is oxidative stress. Melanoma cells experience
increased oxidative stress during metastasis, and must undergo metabolic
changes to survive, including increased dependence on the folate
pathway, a major source of NADPH for oxidative stress resistance.
Cells use NADPH to regenerate glutathione (GSH), a buffer against
oxidative stress. GSH and other antioxidants promote cancer initiation
and progression. This suggests that pro-oxidant therapies would
inhibit the progression of some cancers, although they may promote
the initiation or progression of others.

Lactate synthesis and export from highly glycolytic cells is necessary
to remove excess acid and to sustain glycolysis. Lactate was,
thus, considered a waste product that must be eliminated by cancer
cells despite the fact that some cancer cells take up and metabolize
lactate in culture. Lung cancers and pancreatic cancers use MCT1
to transport lactate from the circulation into the tumour, with some of
the carbon from lactate supplying the tricarboxylic acid (TCA) cycle.
Enhanced lactate transport correlates with worse outcomes, raising
the question of whether lactate consumption is a biomarker of more
aggressive cancers or whether it promotes cancer progression.

Lactate is transported across the cytoplasmic membrane mainly
by MCT1 and MCT4. These transporters enable bidirectional, passive
transport of lactate and related monocarboxylates, including
pyruvate. Although MCT1 transports several carboxylates, its
main physiological function in vivo is lactate import as lactate is at least
tenfold more abundant than other carboxylates in the fed state. Nonetheless,
the directionality of transport by MCT transporters depends
on lactate and proton concentration gradients. MCT1 inhibition can
induce cell death by inhibiting glycolysis as a result of the failure to
export lactate in culture, and can suppress xenograft growth in mice
and cancer cell migration in culture. However, most studies of MCT
function were performed in culture, in which cells tend to be more
highly glycolytic than in vivo, raising the question of whether MCTs
regulate cancer progression in vivo.


Efficient metastasizers take up more lactate 


Efficient metastasizers give rise to circulating cancer cells and distant
macrometastases in patients and after xenografting in NOD-SCID
Il2rg−/− (NSG) mice, whereas inefficient metastasizers do not give rise
to detectable cancer cells in the blood and metastasize more slowly
in mice and in patients (Extended Data Fig. 1a). We subcutaneously
injected efficiently metastasizing (from patients M405, M481, M487
and UT10) and inefficiently metastasizing (from patients M715, UM17,
UM22, UM43, UM47, M498, M528, M597 and M610) melanomas into
NSG mice. We used established techniques to infuse ¹³C-labelled nutrients
into these mice when the tumours reached approximately 2 cm in
diameter, then examined labelling in metabolites extracted from the
blood and tumours. Infusion of uniformly labelled ¹³C-glutamine
([U-¹³C]glutamine) enriched the circulating glutamine pool and produced
no differences in labelling between efficient and inefficient metastasizers
(Extended Data Fig. 1b, c). Infusion of [U-¹³C]glucose modestly but
significantly increased glucose enrichments in inefficient metastasizers
compared with efficient metastasizers (Fig. 1a), despite no differences
in circulating glucose (Extended Data Fig. 1d, e). For this reason, we
normalized glucose-derived metabolites in the tumour to glucose
m+6. After this normalization, the labelling of 3-phosphoglycerate
(3PG) was similar between the tumour types, but the efficiently
metastasizing tumours had increased labelling of lactate compared with 3PG
(Fig. 1b). In efficient, but not inefficient, metastasizers, the absolute
enrichment in circulating lactate also exceeded the enrichment in
tumour 3PG (Fig. 1c). These labelling features in efficient metastasizers
are similar to some human lung cancers, in which excess lactate
labelling relative to 3PG was explained by the uptake of lactate derived
from infused glucose.

Next, we infused [U-¹³C]lactate using conditions that produced
steady-state labelling and abundance in the blood (Extended Data
Fig. 1e, f), and found no differences in the abundance of tumour lactate
between efficient and inefficient metastasizers (Fig. 1d). To account
for labelling resulting from the transfer of ¹³C from lactate to glucose
by gluconeogenesis, followed by glucose uptake and glycolysis in the
tumour, we normalized metabolite labelling to 3PG, which presumably
arises from glycolysis. Lactate enrichment was higher in efficient
compared with inefficient metastasizers, and exceeded enrichment in 3PG
or pyruvate (Fig. 1e). These data suggest that efficient metastasizers are
better than inefficient metastasizers at taking up circulating lactate.


Children’s Research Institute and Department of Pediatrics, University of Texas Southwestern Medical Center, Dallas, TX, USA. Department of Dermatology, University of Texas Southwestern
Hughes Medical Institute, University of Texas Southwestern Medical Center, Dallas, TX, USA. Eugene McDermott Center for Human Growth and Development, University of Texas Southwestern
Medical Center, Dallas, TX, USA. *e-mail: ralph.deberardinis@UTSouthwestern.edu; sean.morrison@UTSouthwestern.edu


Fig. 1 | Efficiently metastasizing melanomas exhibit enhanced lactate
uptake in vivo. Isotope tracing in primary subcutaneous tumours xenografted
in NSG mice with efficiently (M405, M481, M487 and UT10) and inefficiently
(M715, UM17, UM22, UM43, UM47, M498, M528, M597 and M610) metastasizing
melanomas. The number of mice or tumours per treatment is indicated.
a, b, Glucose m+6 as a fraction of the glucose pool (a) and enrichment of other
metabolites normalized to m+6 glucose (b) in subcutaneous tumours after
infusion of [U-¹³C]glucose. c, The 3PG m+3 fraction in subcutaneous tumours
(SQ) and lactate m+3 fraction in the plasma of mice infused with [U-¹³C]glucose
(20 experiments). d, Tumour lactate concentration (3 experiments).
e, Enrichment of metabolites normalized to 3PG m+3 in subcutaneous
tumours after [U-¹³C]lactate infusion (23 experiments). f, Isotope labelling
after [2-²H]lactate infusion (3 experiments). Data are mean ± s.d. Statistical
significance was assessed using t-tests (a, f), paired t-tests (c), log2-transformed
t-tests to compare efficient versus inefficient melanomas or Wilcoxon tests to
compare metabolites (b, e). Multiple comparisons were adjusted using the
Holm-Šidák method (b, c, e, f).
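
The Holm-Šidák adjustment named in the Fig. 1 caption can be illustrated with a short Python sketch. This is a generic step-down implementation written for illustration, not the authors' analysis code; the function name `holm_sidak` and the example p-values are hypothetical.

```python
def holm_sidak(p_values):
    """Return Holm-Sidak step-down adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # Sidak correction over the (m - rank) hypotheses not yet stepped past.
        candidate = 1.0 - (1.0 - p_values[idx]) ** (m - rank)
        # Enforce monotonicity: an adjusted value never falls below an earlier one.
        running_max = max(running_max, candidate)
        adjusted[idx] = min(1.0, running_max)
    return adjusted


# Hypothetical raw p-values for three comparisons (not data from the paper).
raw = [0.01, 0.04, 0.03]
adjusted = holm_sidak(raw)
```

The smallest p-value is corrected across all three comparisons, the next across two, and the largest inherits the running maximum so that the adjusted values keep the order of the raw ones.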


Efficient metastasizers also had higher enrichments in metabolites
related to the TCA cycle (citrate, glutamate and malate) (Fig. 1e), which
suggests that ¹³C from lactate was transferred to the TCA cycle. Both
efficiently and inefficiently metastasizing melanomas expressed lactate
dehydrogenase (LDH) A and B, indicating their capacity to metabolize
lactate (Extended Data Fig. 1i).
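
The normalization used in these tracing experiments (expressing a metabolite's labelled fraction relative to glucose m+6 or 3PG m+3) can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names and peak areas are hypothetical, and natural-abundance correction is ignored.

```python
def isotopologue_fraction(intensities, label):
    """Fraction of a metabolite pool carrying a given mass shift, e.g. 'm+3'."""
    total = sum(intensities.values())
    if total <= 0:
        raise ValueError("metabolite pool has no signal")
    return intensities.get(label, 0.0) / total


def normalize_enrichment(metabolite_fraction, reference_fraction):
    """Express an enrichment relative to a reference pool (glucose m+6 or 3PG m+3)."""
    if reference_fraction <= 0:
        raise ValueError("reference pool is unlabelled")
    return metabolite_fraction / reference_fraction


# Hypothetical peak areas for one tumour after a labelled glucose infusion.
glucose = {"m+0": 600.0, "m+6": 400.0}
lactate = {"m+0": 850.0, "m+3": 150.0}
pg3 = {"m+0": 900.0, "m+3": 100.0}

glucose_m6 = isotopologue_fraction(glucose, "m+6")
lactate_norm = normalize_enrichment(isotopologue_fraction(lactate, "m+3"), glucose_m6)
pg3_norm = normalize_enrichment(isotopologue_fraction(pg3, "m+3"), glucose_m6)

# Lactate labelled beyond 3PG is the pattern the text attributes to lactate uptake.
lactate_exceeds_3pg = lactate_norm > pg3_norm
```

Dividing by the reference pool removes differences in how strongly the tracer labelled each tumour, so that the remaining difference between lactate and 3PG reflects the source of the lactate rather than the depth of labelling.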

To verify lactate uptake directly, we infused [2-²H]lactate. Exchanges
between lactate and pyruvate transfer ²H to NAD⁺, resulting in unlabelled
pyruvate (Extended Data Fig. 1h); thus, the appearance of label in
the tumours indicates the uptake of lactate, not pyruvate. As expected,
we observed label in tumour lactate but not pyruvate or alanine (Fig. 1f).
Lactate labelling was higher in efficient than in inefficient metastasizers
(Fig. 1f), despite similar labelling in the blood (Extended Data Fig. 1g).
Efficient metastasizers also contained labelled malate (Fig. 1f), which
could arise from the transfer of ²H from NAD²H to malate (Extended
Data Fig. 1h).


Higher MCT1 in efficient metastasizers

We observed consistently higher levels of MCT1 in efficient metastasizers
as compared to inefficient metastasizers by western blot analysis
(Fig. 2a; see Extended Data Fig. 2a for quantification). We confirmed this
difference using two other anti-MCT1 antibodies by immunofluorescence
analysis (Extended Data Fig. 2e-j) and flow cytometry (Fig. 2d, e,
Extended Data Fig. 2c; see Extended Data Fig. 2d for quantification). The
difference in surface MCT1 staining between efficient and inefficient
metastasizers by flow cytometry was particularly notable. Immunofluorescence
analysis suggested that MCT1 staining tended to be associated
with the cell surface in efficient metastasizers (Extended Data Fig. 2j),
but more diffusely cytoplasmic in inefficient metastasizers (Extended
Data Fig. 2i).

The expression of MCT1 and CD147 (a co-chaperone of MCT1) did
not differ between primary and metastatic tumours (Extended Data
Fig. 3a-g), consistent with a previous study. We did not detect MCT2 in
any of the melanomas we studied (Fig. 2b). MCT4 was expressed (Fig. 2c;
see Extended Data Fig. 2b for quantification), but did not consistently
differ between primary and metastatic tumours (Extended Data Fig. 3b).


MCT1 is required during metastasis

To test whether MCT1 mediates lactate uptake by melanoma cells, we
transplanted efficiently metastasizing melanomas from three patients
subcutaneously into NSG mice, and then treated half of the mice for
7 days with the selective MCT1 inhibitor AZD3965 (30 mg kg⁻¹ day⁻¹),
which does not have activity against MCT4. We infused [U-¹³C]lactate
and measured the fractional enrichment in lactate relative to 3PG in
the tumours. In all three melanomas, AZD3965 treatment significantly
reduced lactate labelling, to the point that lactate and 3PG were
equivalently labelled, consistent with the labelled lactate arising from
glycolysis rather than lactate uptake (Fig. 2f). Therefore, MCT1 mediates
lactate uptake in efficient metastasizers.

AZD3965 treatment did not significantly alter the levels of MCT1
(Extended Data Fig. 3h, i), CD147 (Extended Data Fig. 3j, k), β1 integrin
(Extended Data Fig. 3n, o) or CD98 (Extended Data Fig. 3l, m) on the
surface of melanoma cells. In addition, AZD3965 treatment did not
significantly alter the levels of IKKα (Extended Data Fig. 3p-r) or IKKβ
(Extended Data Fig. 3s-u), or the epithelial-mesenchymal transition
markers E-cadherin (Extended Data Fig. 4a), N-cadherin (Extended
Data Fig. 4b) or vimentin (Extended Data Fig. 4c).


Fig. 2 | MCT1 inhibition selectively impairs metastasis in human and mouse
melanomas. a-c, Western blot analysis of MCT1 (a), MCT2 (b) and MCT4 (c) in
three efficiently (M405, M481 and UT10) and four inefficiently (M498, M528,
M597 and M610) metastasizing xenografted melanomas. Wild-type (WT)
HCC15 cells were positive controls for MCT1 and MCT4; MCT1- and
MCT4-deficient (KO) HCC15 cells were negative controls for MCT1 and MCT4,
respectively. MCF7 cells were a positive control for MCT2. The data are
representative of four (a) or two (b, c) experiments. d, e, Flow cytometric
analysis of MCT1 surface expression in inefficiently (d) and efficiently (e)
metastasizing melanomas. f, Enrichment of lactate m+3 normalized to 3PG
m+3 in xenografted tumours after treatment with the MCT1 inhibitor AZD3965
or DMSO control and infusion of [U-¹³C]lactate (two experiments per
melanoma). The number of mice per treatment is indicated. g-i, Growth of
subcutaneous tumours (g) in mice treated with AZD3965 (AZD) or DMSO
control; the frequency of circulating melanoma cells in the blood (h); and
metastatic disease burden based on bioluminescence imaging (i). Data in h and
i reflect one (UT10) or two experiments per melanoma, but only one
representative experiment per melanoma is shown in g. j, k, Growth of
subcutaneous tumours (j) and metastatic disease burden at end point by
bioluminescence imaging (k) in mice transplanted with YUMM1.7, YUMM3.3 or
YUMM5.2 mouse melanomas and treated with AZD3965 or DMSO control (two
experiments per melanoma). Data are mean ± s.d. Statistical significance was
assessed using t-tests (f), nparLD (g), mixed-effects analysis (j) or
Mann-Whitney tests (h, i, k). NS, not significant.

To test whether MCT1 inhibition affected primary tumour growth or
metastasis, we subcutaneously transplanted efficiently metastasizing
melanoma cells from three patients into NSG mice. Once tumours
were palpable, we treated every other day with AZD3965. AZD3965
had little effect on the growth of subcutaneous tumours (Fig. 2g) but
substantially reduced the frequency of circulating melanoma cells in
the blood (Fig. 2h), and metastatic disease burden in the same mice
(Fig. 2i, Extended Data Fig. 5).

We also infected melanoma cells from three patients with scrambled
control short hairpin RNA (shRNA) or with shRNAs against MCT1
(also known as SLC16A1) (Extended Data Fig. 6a, b; these shRNAs did
not affect MCT4 expression) and then transplanted the cells
subcutaneously into NSG mice. MCT1 knockdown had little effect on the
growth of the subcutaneous tumours (Extended Data Fig. 6c), but
significantly reduced the frequency of circulating melanoma cells
in the blood (Extended Data Fig. 6d), and metastatic disease burden
in all three melanomas (Extended Data Fig. 6e). The overexpression
of an shRNA-insensitive MCT1 cDNA (Extended Data Fig. 6f) rescued
these effects (Extended Data Fig. 6h) without affecting subcutaneous
tumour growth (Extended Data Fig. 6g).

MCT1 overexpression in inefficiently metastasizing melanoma cells
significantly increased metastatic burden in vivo without affecting
subcutaneous tumour growth (Extended Data Fig. 7e-g). MCT1 is thus
able to increase metastasis in at least some melanomas.

We also inhibited MCT1 in mouse melanomas in immunocompetent
C57BL mice (AZD3965 also has activity against mouse MCT1). MCT1
inhibition by treatment with AZD3965 (Fig. 2j, k) or CRISPR-mediated
deletion (Extended Data Fig. 7a-c) reduced metastatic disease burden
without significantly affecting the growth of subcutaneous tumours.
Human and mouse melanomas thus became more dependent on MCT1
function during metastasis in both immunocompromised and
immunocompetent environments.


Fig. 3 | MCT1 inhibition causes oxidative stress in melanoma cells. a-c,
Representative flow cytometry histograms of ROS levels (a) and fold change in
mean fluorescence intensity (b, c) in melanoma cells from mice treated with
AZD3965 (blue) or DMSO control (black) (two experiments per melanoma).
The number of tumours or mice analysed per treatment is indicated. Unst.,
unstained. d-f, Growth of subcutaneous tumours (d) in xenografted mice
treated with DMSO, AZD3965, N-acetyl cysteine (NAC), or AZD3965 plus NAC,
as well as the frequency of circulating melanoma cells in the blood (e) and
metastatic disease burden based on bioluminescence imaging at end point (f).
Data in e and f reflect three experiments per melanoma, but only one
representative experiment per melanoma is shown in d. Data are mean ± s.d.
Statistical significance was assessed using log2-transformed t-tests (b),
Mann-Whitney tests (c), nparLD followed by Benjamini-Hochberg's multiple
comparisons adjustment (d) and log2-transformed one-way ANOVA with
Holm-Šidák's multiple comparisons adjustment (e, f).


MCT1 promotes cell survival during metastasis 


Inhibition of MCT1 with AZD3965 did not impair the migration or
invasion of melanoma cells in culture (Extended Data Fig. 8a). Acute
treatment with AZD3965 for 7 days in mice with established subcutaneous
and metastatic tumours did not significantly affect the growth of
subcutaneous or metastatic tumours, but did reduce the frequency of
melanoma cells in the blood (Extended Data Fig. 8b, c). This suggests
that MCT1 inhibition reduced the survival of melanoma cells during
metastasis.


Fig. 4 | MCT1 inhibition reduces flux through the oxidative branch of the
PPP relative to glycolysis. a, Glucose m+2 as a fraction of total glucose in
xenografted tumours after infusion of [1,2-¹³C]glucose (six experiments).
The number of tumours or mice per treatment is indicated. b, The lactate
m+1/lactate m+2 ratio in subcutaneous tumours from the same mice (two
experiments per melanoma). c, d, Intracellular pH (c) and NAD⁺/NADH ratio (d)
in dissociated melanoma cells from subcutaneous tumours (one experiment
per melanoma). e-j, Fractional enrichment in glycolytic (e-h) and PPP (i, j)
metabolites, 30, 60 or 180 min after [U-¹³C]glucose infusion (two experiments).
Fructose-6-P, fructose-6-phosphate. Data are mean ± s.d. Statistical
significance was assessed using t-tests (a, b and d), nparLD (c) or repeated
measures two-way ANOVA (e-j).


Fig. 5 | Heterogeneous MCT1 expression among melanoma cells from the
same tumour. a-d, Flow cytometric analysis of anti-MCT1 staining in
melanoma cells from subcutaneous tumours (a, c) or circulating melanoma
cells (b, d) from the same mice xenografted with M405 (a, b) or M481 (c, d) (the
gating strategy to identify human melanoma cells is in Extended Data Fig. 9e,
f; data are representative of two experiments). e, Flow cytometrically isolated
MCT1high or MCT1−/low melanoma cells were intravenously transplanted into NSG
mice, using 100 or 1,000 cells per injection. The percentage of injections that
formed metastatic tumours is shown (one or two experiments per melanoma).
The number of mice analysed per treatment is indicated. f, Metastatic disease
burden in the visceral organs of mice that survived to end point after injection
with 100 (M405 and M481) or 1,000 (UT10) cells based on bioluminescence
signal intensity. Data are mean ± s.d. Statistical significance was assessed using
multiple linear regression (e) or Mann-Whitney tests (f).


To test this further, we resected the primary tumours to extend
mouse survival (see schematic in Extended Data Fig. 8d). Treatment
with AZD3965 before primary tumour resection, when cells were
spontaneously metastasizing, significantly reduced metastatic tumour
burden (Extended Data Fig. 8e). By contrast, treatment with AZD3965
only after primary tumour resection (after metastatic tumours were
established) did not reduce metastatic tumour burden (Extended
Data Fig. 8e). Melanoma cells are, therefore, particularly dependent
on MCT1 during metastasis.

Analyses of clinical data and TCGA data showed that higher expression
of MCT1 is associated with significantly worse overall survival
(Extended Data Fig. 9a). Differences in MCT2 (also known as SLC16A7)
or MCT4 (SLC16A3) expression did not significantly affect survival
(Extended Data Fig. 9b, c). Consistent with the correlation between
CD147 and MCT1 expression, higher CD147 expression was also associated
with significantly worse survival (Extended Data Fig. 9d).


Inhibition of MCT1 induces oxidative stress


Inhibition of MCT1 or MCT4 in cancer cells in culture promotes oxidative
stress by inhibiting lactate export, leading to reduced glycolysis.
AZD3965 treatment increased levels of reactive oxygen species (ROS)
in all three melanomas (Fig. 3a-c; see Extended Data Fig. 9e, f for the
gating strategy to identify melanoma cells), as did deletion of MCT1
from YUMM cells (Extended Data Fig. 7d). AZD3965 did not increase
ROS levels in melanomas after shRNA-mediated knockdown of MCT1,
which suggests an on-target effect (Extended Data Fig. 6i). AZD3965 also
reduced the ratios of GSH to oxidized glutathione (GSSG) (Extended
Data Fig. 10a) and levels of NADPH (Extended Data Fig. 10b). Moreover,
treatment with the antioxidant N-acetyl cysteine (NAC) rescued the
effects of AZD3965 on circulating melanoma cells and metastatic
disease burden (Fig. 3d-f). MCT1 inhibition thus impairs metastasis at
least partly by increasing oxidative stress.

To test whether MCT1 inhibition affected the pentose phosphate
pathway (PPP), we infused [1,2-¹³C]glucose into xenografted mice and
compared the relative flux of labelled glucose through glycolysis versus
the oxidative PPP by comparing the ratio of m+1 lactate (derived from
the oxidative PPP) to m+2 lactate (derived from glycolysis) (Extended
Data Fig. 10c). We observed a trend towards increased glucose enrichment
in tumours treated with AZD3965 (Fig. 4a). We consistently
observed a lower m+1 lactate/m+2 lactate ratio in AZD3965-treated
as compared to control tumours for all three melanomas (Fig. 4b,
Extended Data Fig. 10d). This suggests that MCT1 inhibition reduced
flux through the oxidative PPP relative to glycolysis.
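
The m+1/m+2 lactate comparison described above for the [1,2-¹³C]glucose infusions can be sketched in Python. This is an illustrative calculation only: the function name and isotopologue intensities are hypothetical, and natural-abundance correction is ignored.

```python
def ppp_to_glycolysis_ratio(lactate):
    """Ratio of m+1 lactate (oxidative PPP) to m+2 lactate (glycolysis)."""
    m1 = lactate.get("m+1", 0.0)
    m2 = lactate.get("m+2", 0.0)
    if m2 <= 0:
        raise ValueError("no m+2 lactate: ratio is undefined")
    return m1 / m2


# Hypothetical lactate isotopologue intensities (not data from the paper).
control = {"m+0": 900.0, "m+1": 20.0, "m+2": 80.0}
treated = {"m+0": 900.0, "m+1": 10.0, "m+2": 90.0}

control_ratio = ppp_to_glycolysis_ratio(control)
treated_ratio = ppp_to_glycolysis_ratio(treated)

# A lower ratio after treatment is read as reduced oxidative PPP flux
# relative to glycolysis.
reduced_ppp = treated_ratio < control_ratio
```

Because both isotopologues derive from the same infused tracer, the ratio is less sensitive to differences in overall glucose enrichment between tumours than either fraction alone.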

After infusion of [U-¹³C]glucose into xenografted mice (Extended
Data Fig. 10e), AZD3965 treatment did not alter isotope enrichment in
glucose or glycolytic intermediates (Fig. 4e-h), but reduced isotope
enrichment in the oxidative PPP (Fig. 4i, j). AZD3965 treatment did not
generally reduce the levels of glycolytic intermediates (Extended Data
Fig. 10f, h) but did reduce the levels of oxidative PPP intermediates
(Extended Data Fig. 10g, i). Therefore, the effect of MCT1 inhibition in
melanoma cells in vivo (inhibition of lactate import, favouring glycolysis
over the PPP) was quite different from MCT1 inhibition in culture
(inhibition of lactate export, reducing glycolysis). In lung cancer,
MCT1 deletion also reduced lactate export and glycolysis in culture,
but reduced lactate uptake and enhanced glucose metabolism in vivo.

Lactate import can alter intracellular pH and the NAD⁺/NADH ratio
because lactate is co-transported with a proton and converted to pyruvate
intracellularly, converting NAD⁺ to NADH. Consistent with this, in
all three melanomas, AZD3965 treatment significantly increased
intracellular pH (Fig. 4c), strongly suggesting substantial MCT1-dependent
lactate and proton import in these tumours. The increase in pH after
MCT1 inhibition could reduce flux through the PPP relative to glycolysis,
as increased pH activates phosphofructokinase and suppresses
glucose-6-phosphate dehydrogenase, the rate-limiting enzymes in
glycolysis and the PPP, respectively. AZD3965 treatment also significantly
increased the NAD⁺/NADH ratio (Fig. 4d), which has the potential to
enhance glycolysis at the expense of the PPP.


Heterogeneity in MCT1 expression 


Flow cytometry revealed a more prominent MCT1high cell population
among melanoma cells in the blood (see arrows in Fig. 5b, d) as
compared with subcutaneous tumours in the same mice (Fig. 5a, c).
This suggests that surface MCT1 was upregulated in circulating cells
to increase PPP function, or that MCT1high cells preferentially survived
during metastasis.

To test whether differences in MCT1 expression conferred differences 
in metastatic potential, we isolated MCT1^high and MCT1^−/low melanoma 
cells by flow cytometry from subcutaneously growing M405, M481 
and UT10 xenografts and then transplanted the cells either 
subcutaneously (where oxidative stress does not appear to be limiting for 
tumour formation) or intravenously (where oxidative stress is limiting 
for tumour formation). MCT1^high and MCT1^−/low cells did not differ in 
their ability to form subcutaneous tumours or the rates at which the 
subcutaneous tumours grew (Extended Data Fig. 10j). By contrast, 
after intravenous injection, MCT1^high cells formed significantly more 
metastatic tumours than MCT1^−/low cells (Fig. 5e) and the metastatic 
disease burden in visceral organs was significantly greater (Fig. 5f). 
This suggests that differences in MCT1 expression confer differences 
in the ability to survive during metastasis. 

The ability of MCT1 to export lactate and to transport other monocar- 
boxylates bidirectionally’*’”° may contribute to its ability to promote 
metastasis. Other MCT transporters, such as MCT4, may also influence 
the survival of melanoma cells during metastasis. Lactate taken up by 
melanoma cells via MCT1 probably has several metabolic fates. Some of 
the lactate, or pyruvate generated from the lactate, might be exported 
from the cell?°. The conversion of imported lactate to pyruvate gener- 
ates NADH and a proton and could therefore stimulate PPP flux by 
reducing both intracellular pH and the NAD*/NADH ratio, even if the 
resulting pyruvate is exported from the cell. 

Methods 


Melanoma specimen collection and enzymatic tumour 
disaggregation 

Melanoma specimens were obtained with informed consent from 
patients according to protocols approved by the Institutional Review 
Board of the University of Michigan Medical School (IRBMED approvals 
HUM00050754 and HUM00050085) and the University of Texas 
Southwestern Medical Center (IRB approval 102010-051). Materials 
used in the manuscript are available, either commercially or from the 
authors, though there are restrictions imposed by Institutional Review 
Board requirements and institutional policy on the sharing of materials 
from patients. Single-cell suspensions were obtained by dissociating 
tumours in Kontes tubes with disposable pestles (VWR) followed 
by enzymatic digestion in 200 U ml⁻¹ collagenase IV (Worthington), 
DNase (50 U ml⁻¹) and 5 mM CaCl₂ for 20 min at 37 °C. Cells were filtered 
through a 40-µm cell strainer to remove clumps. 


Mouse studies and xenograft assays 

All mouse experiments complied with all relevant ethical regulations 
and were performed according to protocols approved by the Insti- 
tutional Animal Care and Use Committee at the University of Texas 
Southwestern Medical Center (protocol 2016-101360). Melanoma 
cell suspensions were prepared for injection in staining medium (L15 
medium containing bovine serum albumin (1 mg ml⁻¹), 1% penicillin/ 
streptomycin and 10 mM HEPES (pH 7.4) with 25% high-protein Matrigel 
(product 354248; BD Biosciences)). Subcutaneous injections were 
performed in the right flank of NOD.CB17-Prkdc^scid Il2rg^tm1Wjl/SzJ (NSG) 
mice in a final volume of 50 µl. Four-to-eight-week-old male and female 
NSG mice were transplanted with 100 melanoma cells subcutaneously 
unless otherwise specified. Mouse cages were randomized between 
treatments (mice within the same cage had to be part of the same treat- 
ment). Both male and female mice were used. Subcutaneous tumour 
diameters were measured weekly with callipers until any tumour in 
the mouse cohort reached 2.5 cm in its largest diameter, in agreement 
with the approved animal protocol. At that point, all mice in the cohort 
were euthanized and spontaneous metastasis was evaluated by gross 
inspection of visceral organs for macrometastases and biolumines- 
cence imaging of visceral organs to quantify metastatic disease burden 
(see details below). 

YUMM1.7 (Braf^V600E/+;Pten^−/−;Cdkn2^−/−), YUMM3.3 (Braf^V600E/+;Cdkn2a^−/−) 
and YUMM5.2 (Braf^V600E/+;p53^−/−) (p53 is also known as Trp53) cell 
lines*° were obtained from and authenticated by ATCC and cell lines 
were confirmed to be mycoplasma free using the MycoAlert detection 
kit (Lonza). YUMM1.7, YUMM3.3 and YUMM5.2 were transfected with 
dsRed2 and luciferase (dsRed2-P2A-Luc) for bioluminescence imaging. 
Subcutaneous injections of 20,000-50,000 cells were performed in the 
right flank of 6-to-8-week-old male and female C57BL/6 mice in 50 µl. 

For studies that involved treatment with the MCT1 inhibitor 
(AZD3965, Selleckchem), when subcutaneous tumours became pal- 
pable, the mice were administered AZD3965 by oral gavage every sec- 
ond day in xenografted mice and every day for mice transplanted with 
YUMM cells (30 mg kg⁻¹ body mass in 200 µl of 0.5% promethylcellulose, 
0.2% Tween80 and 5% DMSO). Tumour growth was monitored weekly 
with a calliper. Mice were euthanized when the primary tumour reached 
2.5 cm in its largest diameter. In addition to measuring subcutaneous 
tumour diameters, the frequency of circulating melanoma cells in the 
blood (obtained by cardiac puncture) was measured by flow cytometry, 
and metastatic disease burden was measured by total bioluminescence 
levels in dissected visceral organs. 


Bioluminescence imaging 

Metastatic disease burden was monitored using bioluminescence 
imaging (all melanomas were tagged with stable expression of 
luciferase). Five minutes before performing luminescence imaging, 
mice were injected intraperitoneally with 100 µl of PBS containing 
D-luciferin monopotassium salt (40 mg ml⁻¹) (Biosynth) and mice 
were anaesthetized with isoflurane 2 min before imaging. All mice 
were imaged using an IVIS Imaging System 200 Series (Caliper Life 
Sciences) with Living Image software. After completion of whole- 
body imaging, mice were euthanized and individual organs were 
surgically removed and imaged. The exposure time ranged from 
10 to 60 s, depending on the maximum signal intensity, to avoid 
saturation of the luminescence signal. To measure the background 
luminescence, a negative control mouse not transplanted with mela- 
noma cells was imaged. The bioluminescence signal (total photon 
flux) was quantified with ‘region of interest’ measurement tools in 
Living Image (Perkin Elmer) software. Metastatic disease burden 
was calculated as observed total photon flux across all organs in 
xenografted mice minus background total photon flux in negative 
control mice. Negative values were set to 1 for purposes of presenta- 
tion and statistical analysis. 
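
The background subtraction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the analysis code used for the paper: the function name and the example flux values are hypothetical, and the only rules taken from the text are the subtraction of the negative-control signal and the replacement of negative results by 1.

```python
def metastatic_burden(organ_flux, background_flux):
    """Return background-subtracted total photon flux for one mouse.

    organ_flux: total photon flux measured for each dissected organ.
    background_flux: total photon flux from a negative control mouse.
    Negative results are set to 1, as stated in the text.
    """
    corrected = sum(organ_flux) - background_flux
    return 1.0 if corrected < 0 else corrected


# Hypothetical example values (photons per second).
print(metastatic_burden([2e5, 3e5], 1e5))  # signal above background
print(metastatic_burden([1e4], 5e4))       # signal below background
```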


Cell labelling and flow cytometry 

Melanoma cells were identified and sorted by flow cytometry as previously 
described. All antibody staining was performed for 20 min on ice, 
followed by washing with HBSS and centrifugation at 200g for 5 min. 
Cells were stained with directly conjugated antibodies against mouse 
CD45 (violetFluor 450, eBiosciences), mouse CD31 (390-eFluor450, Bio- 
legend), mouse Ter119 (eFluor450, eBiosciences) and human HLA-ABC 
(G46-2.6-FITC, BD Biosciences). Human melanoma cells were isolated 
as cells that were positive for HLA and negative for mouse endothelial 
and haematopoietic markers. Cells were washed with staining medium 
and re-suspended in 4’,6-diamidino-2-phenylindole (DAPI; 1 µg ml⁻¹; 
Sigma) to eliminate dead cells from sorts and analyses. To analyse 
other markers, cells were stained with Alexa Fluor647-conjugated 
anti-human MCT1 (Bioss antibodies), Alexa Fluor488-conjugated 
anti-human CD147, PE-Vio770-conjugated anti-human CD98, Alexa 
Fluor700-conjugated anti-human β1-integrin, FITC-conjugated 
anti-E-cadherin (CD324) or PE/Cy7-conjugated anti-N-cadherin (CD325). Cells 
were examined on an LSRFortessa cell analyser (Becton Dickinson) or 
sorted on a FACS Fusion Cell Sorter (Becton Dickinson). For analysis 
of circulating melanoma cells, blood was collected from mice by 
cardiac puncture with a syringe pretreated with citrate-dextrose solution 
(Sigma) when subcutaneous tumours reached 2.5 cm in diameter. Red 
blood cells were sedimented using Ficoll, according to the manufac- 
turer’s instructions (Ficoll Paque Plus, GE Healthcare). Remaining cells 
were washed with HBSS (Invitrogen) before antibody staining and 
flow cytometry. 


Lentiviral/shRNA transduction of human melanoma cells 

All melanomas expressed DsRed and luciferase as previously 
described. All shRNAs were expressed from a pGFP-C-shLenti vector 
(Origene). For knockdown of MCT1, Origene shRNA clones TL309405A 
(5’-GAGGAAGAGACCAGTATAGATGTTGCTGG-3’) and TL309405B 
(5’-ATCCAGCTCTGACCATGATTGGCAAGTAT-3’) were used. For 
overexpression of MCT1, the human open reading frame was obtained 
from the Precision LentiORF collection (Dharmacon) in a bicistronic 
lentiviral construct that co-expressed turbo green fluorescent protein 
(pLOC-MCT1-IRES-tGFP). As a control, turbo red fluorescent protein 
(tRFP) was expressed in place of MCT1 in the same construct 
(pLOC-tRFP-IRES-tGFP). In rescue experiments, the MCT1 cDNA was mutated 
to change wobble bases in 10 consecutive codons to render the MCT1 
cDNA insensitive to the anti-MCT1 shRNAs we used without affecting the 
amino acid sequence (5’-GAGGAAGAGACCAGTATAGATGTTGCTGGG-3’ 
to 5’-GAAGAGGAAACTAGCATTGACGTCGCAGGC-3’ for shRNA #1 and 
5’-AATCCAGCTCTGACCATGATTGGCAAGTAT-3’ to 
5’-AACCCGGCCCTAACGATGATAGGGAAATAC-3’ for shRNA #2). The shRNA-resistant 
MCT1 sequence was cloned into the pLVX-EF1a-IRES-mCherry lentiviral 
vector to infect melanoma cells. 
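
The claim that the wobble-base changes leave the amino acid sequence intact can be checked directly from the sequences quoted above. The following Python sketch is an illustrative check rather than part of the original methods: it translates the original and mutated 30-nucleotide stretches with the standard genetic code (assuming the quoted sequences are in frame, starting at the first base) and counts the nucleotide differences. The helper names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Original and shRNA-resistant sequences as quoted in the text.
TARGETS = {
    "shRNA #1": ("GAGGAAGAGACCAGTATAGATGTTGCTGGG",
                 "GAAGAGGAAACTAGCATTGACGTCGCAGGC"),
    "shRNA #2": ("AATCCAGCTCTGACCATGATTGGCAAGTAT",
                 "AACCCGGCCCTAACGATGATAGGGAAATAC"),
}


def translate(dna):
    """Translate an in-frame DNA string, ignoring any trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def mismatches(a, b):
    """Count positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


for name, (original, resistant) in TARGETS.items():
    same_protein = translate(original) == translate(resistant)
    print(name, translate(original), same_protein, mismatches(original, resistant))
```

Both pairs translate to the same peptide, while the nucleotide sequences differ at many third-codon positions, which is what disrupts shRNA pairing.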




For virus production, 0.9 µg of the appropriate plasmid together 
with 1 µg of helper plasmids (0.4 µg pMD2G and 0.6 µg of psPAX2) were 
transfected into 293T cells using PolyJet (SignaGen) according to the 
manufacturer’s instructions. The resulting replication-incompetent 
viral supernatants were collected at 48 h after transfection and filtered 
through a 45-µm filter. Then, 300,000 freshly dissociated melanoma 
cells were infected with viral supernatants supplemented with 10 µg ml⁻¹ 
polybrene (Sigma) for 4 h. Cells were then washed twice with staining 
medium (L15 medium containing bovine serum albumin (1 mg ml⁻¹), 
1% penicillin/streptomycin and 10 mM HEPES (pH 7.4)), and approxi- 
mately 25,000 cells (a mixture of infected and non-infected cells) were 
suspended in staining medium with 25% high-protein Matrigel (product 
354248; BD Biosciences) and then injected subcutaneously into NSG 
mice. After growing to 1-2 cm in diameter, the tumours were excised 
and dissociated into single-cell suspensions as described above. DsRed 
and GFP double-positive cells were sorted and transplanted into NSG 
mice for in vivo studies to assess the effect of each shRNA construct 
on tumour growth and metastasis. 


CRISPR editing of MCT1 in mouse melanoma cells 

Single-guide RNAs (sgRNAs) targeting exon 2 of mouse Mct1 were 
designed using publicly available tools (http://crispr.mit.edu): Mct1 
sgRNA #1, 5’-AAATGCCACCTGCGATTGGA-3’; Mct1 sgRNA #2, 
5’-ATGGATATCATCTATAATGT-3’. The sgRNAs were cloned into the 
U6-driven Cas9 expression vector (pX458-pSpCas9(BB)-2AGFP; 48318, 
Addgene). Approximately 100,000 YUMM1.7 mouse melanoma cells 
were plated in tissue-culture-treated 6-well plates in DMEM low glucose 
plus 10% fetal bovine serum (FBS) and 1% penicillin/streptomycin. One 
microgram of each of the two sgRNA constructs was co-transfected 
into the melanoma cells using PolyJet (SignaGen) according to the 
manufacturer’s instructions. After 48 h, GFP⁺ cells were sorted into 
96-well plates with DMEM low glucose plus 10% FBS and 1% penicillin/ 
streptomycin at clonal density, then clones were expanded and genomic 
DNA was isolated to screen for MCT1 exon 2 deletions. 


Cell invasion 

Transwell invasion assays were carried out using Corning BioCoat 
Tumour Invasion Systems (354165, Corning) as previously described”. 
In brief, 5 x 10° cells were seeded in the upper chamber of each well in 
serum-free culture medium. FBS (10%) in DMEM in the lower cham- 
ber was used as the chemoattractant. The invasive cells that migrated 
across the insert towards the lower chamber were stained with crystal 
violet solution after 24 h of incubation at 37 °C in 5% CO,. Images were 
captured using an Olympus microscope with a DP71 high-resolution 
digital camera and cells were counted using ImageJ. 


In vivo isotope tracing 

All in vivo isotope tracing experiments were performed when 
subcutaneous tumours reached 2 cm in diameter. Before infusions, mice were 
fasted for 16 h, then a 27-gauge catheter was placed in the lateral tail vein 
under anaesthesia. We intravenously infused [U-¹³C]glutamine (CLM-1822, 
Cambridge Isotope Laboratories) as a bolus of 0.1725 mg g⁻¹ body 
mass over 1 min in 150 µl of saline, followed by continuous infusion of 
0.00288 mg g⁻¹ body mass per min for 5 h (in a volume of 150 µl h⁻¹). For 
infusions of [U-¹³C]glucose (CLM-1396, Cambridge Isotope Laboratories) 
and [1,2-¹³C]glucose (CLM-504, Cambridge Isotope Laboratories), we 
intravenously infused a bolus of 0.4125 mg g⁻¹ body mass over 1 min in 
125 µl of saline, followed by continuous infusion of 0.008 mg g⁻¹ body 
mass per min for 3 h (in a volume of 150 µl h⁻¹). At the end of the infusion, 
mice were killed and tumours were collected and immediately frozen 
in liquid nitrogen. To assess the fractional enrichments in plasma, 20 
µl of blood was obtained after 30, 60, 120 and 180 min of infusion. For 
[U-¹³C]lactate (CLM-1579, Cambridge Isotope Laboratories) and [2-²H] 
lactate (693987, Sigma-Aldrich) infusion, we intravenously infused a 
bolus of 0.24 mg g⁻¹ body mass over 10 min in 15 µl of saline, followed by 
continuous infusion of 0.0048 mg g⁻¹ body mass per min for 3 h (in 120 
µl h⁻¹). Care was taken during infusions not to increase blood glucose 
or lactate concentrations over pre-infusion levels. 
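
Because the infusion doses are specified per gram of body mass, the amount of tracer for a given mouse follows from simple arithmetic. The Python sketch below illustrates this for the glucose protocol; the 25 g body mass is an assumed example value and the function name is hypothetical.

```python
def infusion_plan(body_mass_g, bolus_mg_per_g, rate_mg_per_g_min, duration_min):
    """Compute tracer amounts (mg) for a bolus plus continuous infusion."""
    bolus_mg = body_mass_g * bolus_mg_per_g
    continuous_mg = body_mass_g * rate_mg_per_g_min * duration_min
    return {
        "bolus_mg": bolus_mg,
        "continuous_mg": continuous_mg,
        "total_mg": bolus_mg + continuous_mg,
    }


# Glucose protocol from the text, applied to a hypothetical 25 g mouse:
# bolus of 0.4125 mg per g, then 0.008 mg per g per min for 3 h (180 min).
plan = infusion_plan(25.0, 0.4125, 0.008, 180)
for key, value in plan.items():
    print(f"{key}: {value:.2f}")
```

For this assumed body mass the bolus is roughly 10.3 mg and the continuous phase delivers about 36 mg of tracer.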


Gas chromatography mass spectrometry 

For gas chromatography-tandem mass spectrometry (GC-MS), subcu- 
taneous tumour fragments weighing 5-15 mg were homogenized using 
an electronic tissue disruptor (Qiagen) in ice-cold 80:20 methanol:water 
(v/v) followed by three freeze-thaw cycles in liquid nitrogen. The super- 
natant was collected after a 10-min centrifugation at 13,000g at 4 °C 
then lyophilized. To analyse isotope enrichment in the plasma, whole 
blood was chilled on ice then centrifuged for 1 min at 13,000g at 4 °C 
to separate the plasma. Aliquots of 20-40 µl of plasma were added to 
80:20 methanol:water to extract the metabolites, then lyophilized using 
a SpeedVac (Thermo), and re-suspended in 40 µl anhydrous pyridine. 
This solution was added to pre-prepared GC-MS autoinjector vials 
containing 80 µl N-(tert-butyldimethylsilyl)-N-methyltrifluoroacetamide 
(MTBSTFA) to derivatize polar metabolites. The samples were 
incubated at 70 °C for 1 h, then aliquots of 1 µl were injected for 
analysis. Samples were analysed using either an Agilent 6890 or an Agilent 
7890 gas chromatograph coupled to an Agilent 5973N or 5975C Mass 
Selective Detector, respectively. The observed distributions of mass 
isotopologues were corrected for natural abundance”. 


Metabolomic analysis 

HILIC chromatographic separation of metabolites was achieved using a 
Millipore ZIC-pHILIC column (5 µm, 2.1 × 150 mm) with a binary solvent 
system of 10 mM ammonium acetate in water, pH 9.8 (solvent A) and 
acetonitrile (solvent B) with a constant flow rate of 0.25 ml min⁻¹. For 
gradient separation, the column was equilibrated with 90% solvent B. After 
injection, the gradient proceeded as follows: 0-15 min linear ramp from 
90% B to 30% B; 15-18 min isocratic flow of 30% B; 18-19 min linear ramp 
from 30% B to 90% B; 19-27 min column regeneration with isocratic flow of 
90% B. Metabolites were measured with a Thermo Scientific QExactive 
HF-X hybrid quadrupole orbitrap high-resolution mass spectrometer 
(HRMS) coupled to a Vanquish UHPLC. HRMS data were acquired with 
two separate acquisition methods. Individual samples were acquired 
with an HRMS full scan (precursor ion only) method switching between 
positive and negative polarities. For data-dependent, high-resolution 
tandem mass spectrometry (ddHRMS/MS) methods, precursor ion 
scans were acquired at a resolving power of 60,000 full width at half- 
maximum (FWHM) with a mass range of 80-1,200 Da. The AGC tar- 
get value was set to 1 x 10° with a maximum injection time of 100 ms. 
Pooled samples were generated from an equal mixture of all individual 
samples and analysed using individual positive- and negative-polarity 
spectrometry ddHRMS/MS acquisition methods for high-confidence 
metabolite ID. Product ion spectra were acquired at a resolving power 
of 15,000 FWHM without a fixed mass range. The AGC target value was 
set to 2 x 10° with a maximum injection time of 150 ms. Data-dependent 
parameters were set to acquire the top 10 ions with a dynamic exclusion 
of 30 s and a mass tolerance of 5 ppm. Isotope exclusion was turned on 
and a stepped normalized collision energy applied with values of 30, 
50 and 70. Settings remained the same in both polarities. 
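
The HILIC gradient described above is a piecewise-linear programme, which can be written down compactly. The following Python sketch encodes the time points from the text and interpolates the percentage of solvent B at any time; the names GRADIENT and percent_b are hypothetical and the code is only an illustration of the programme, not instrument control software.

```python
# (time in min, % solvent B) breakpoints taken from the text.
GRADIENT = [(0.0, 90.0), (15.0, 30.0), (18.0, 30.0), (19.0, 90.0), (27.0, 90.0)]


def percent_b(t_min, program=GRADIENT):
    """Return % solvent B at time t_min by linear interpolation."""
    if not program[0][0] <= t_min <= program[-1][0]:
        raise ValueError("time outside the gradient programme")
    for (t0, b0), (t1, b1) in zip(program, program[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


for t in (0, 7.5, 16, 18.5, 25):
    print(t, percent_b(t))
```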

Metabolite identities were confirmed in three ways: (1) precursor 
ion m/z was matched within 5 ppm of theoretical mass predicted by 
the chemical formula; (2) fragment ion spectra were matched within a 
5 ppm tolerance to known metabolite fragments; and (3) the retention 
time of metabolites was within 5% of the retention time of a purified 
standard run with the same chromatographic method. Metabolites 
were relatively quantitated by integrating the chromatographic peak 
area of the precursor ion searched within a 5 ppm tolerance. 
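
Two of the three identification criteria above are simple numerical checks. As a rough sketch (fragment-spectrum matching is omitted, and the function names and example values are hypothetical), the precursor mass error in ppm and the retention-time window could be evaluated like this in Python:

```python
def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def passes_mz_and_rt(observed_mz, theoretical_mz, observed_rt, standard_rt,
                     ppm_tol=5.0, rt_tol=0.05):
    """Check criteria (1) and (3): mass within ppm_tol, retention time within rt_tol."""
    mz_ok = abs(ppm_error(observed_mz, theoretical_mz)) <= ppm_tol
    rt_ok = abs(observed_rt - standard_rt) <= rt_tol * standard_rt
    return mz_ok and rt_ok


# Hypothetical peak compared against a theoretical m/z of 664.1164
# and a purified standard eluting at 10.0 min.
print(round(ppm_error(664.1180, 664.1164), 2))
print(passes_mz_and_rt(664.1180, 664.1164, 10.2, 10.0))
```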


GSH/GSSG analysis by LC-MS/MS 


For analysis of the GSH to GSSG ratio by liquid chromatography-tandem 
mass spectrometry (LC-MS/MS), subcutaneous tumour 
fragments weighing 5-15 mg were homogenized using an electronic 
tissue disruptor (Qiagen) in ice-cold 80:20 methanol:water (v/v), with 
0.1% formic acid to prevent spontaneous oxidation, followed by three 
freeze-thaw cycles in liquid nitrogen. The supernatant was collected 
after a 10-min centrifugation at 13,000g at 4 °C then lyophilized. 
Lyophilized samples were reconstituted in 100 µl of 0.1% formic acid in 
water, vortexed and analysed by LC-MS/MS. GSH/GSSG analysis was 
performed using a SCIEX 6500+ Q-Trap mass spectrometer coupled 
to a Shimadzu LC-20A UHPLC system. Chromatographic separation 
was carried out with a Waters HSS T3 column and a binary solvent 
gradient of water with 0.1% formic acid (solvent A) and acetonitrile 
with 0.1% formic acid (solvent B). The following gradient was used for 
separation: 0-3 min, isocratic flow of 0% B; 3-8 min, 0-100% B; 8-13 
min, isocratic flow of 100% B; 13-13.1 min, 100-0% B; 13.1-18 min, 
isocratic flow of 0% B. The flow rate was held constant at 0.2 ml min⁻¹. The 
mass spectrometry analysis was operated in MRM mode monitoring 
the following transitions for GSH, GSSG and their respective internal 
standards in positive mode: GSH 308/162; GSSG 613/355; GSH internal 
standard (ISTD) 311/165; GSSG ISTD 619/165. Transitions and source 
parameters were optimized by infusion before analysis. GSH/GSSG 
ratios were calculated by first determining the molar values of GSH and 
GSSG individually using a standard curve and the addition of internal 
standards. Data are reported as the ratio of calculated molar values. 


¹³C tracing analysis for glycolytic and PPP metabolites 

The theoretical masses of ¹³C isotopes of glycolytic and PPP metabolites 
were calculated and added to a library of predicted isotopes. These 
masses were then searched with a 5 ppm tolerance and integrated only if 
the peak apex showed less than 1% difference in retention time from the 
[U-¹²C] monoisotopic mass in the same chromatogram. After analysis 
of the raw data, theoretical natural abundance was calculated. Natural 
isotope abundances were corrected using a customized R script, which 
can be found at the GitHub repository (https://github.com/wencgu/ 
nac). The script was written by adapting the AccuCor algorithm”. 


NAD⁺/NADH analysis by LC-MS/MS 

Analysis of NAD⁺/NADH levels was performed on 5-15 mg tumour specimens. 
Tissues were homogenized manually with a pestle in ice-cold 
80:20 methanol:water (v/v). After thorough homogenization, samples 
were spun at 13,000g for 15 min at 4 °C. Samples were then transferred 
to a fresh conical tube and spun for an additional 10 min at 13,000g at 
4 °C. The supernatant was placed directly into autosampler vials for 
analysis by LC/MS. 

NAD⁺/NADH measurements were carried out on a Thermo Scientific 
QExactive HF-X hybrid quadrupole orbitrap HRMS coupled to a 
Vanquish UHPLC. Chromatographic separation of metabolites was 
achieved using a Millipore ZIC-pHILIC column (5 µm, 2.1 × 150 mm) 
with a binary solvent system of 10 mM ammonium acetate in water, 
pH 9.8 (solvent A) and acetonitrile (solvent B) with a constant flow rate 
of 0.25 ml min⁻¹. For gradient separation, the column was equilibrated 
with 90% solvent B. After injection, the gradient proceeded as follows: 
0-15 min linear ramp from 90% B to 30% B; 15-18 min isocratic flow of 
30% B; 18-19 min linear ramp from 30% B to 90% B; 19-27 min of column 
regeneration with isocratic flow of 90% B. 

HRMS data were acquired with two different methods. Pooled sam- 
ples were generated from an equal mixture of all individual samples 
and were analysed using individual positive- and negative-polarity 
ddHRMS/MS for high-confidence metabolite ID. Individual conditions 
were acquired with an HRMS full scan (precursor ion only) switching 
between positive and negative polarities. For ddHRMS/MS methods, 
precursor ion scans were acquired at a resolving power of 60,000 
FWHM, with a mass range of 80-1,200 Da. The automatic gain control 
(AGC) target value was set to 10°, with a maximum injection time 
of 100 ms. Product ion spectra were acquired at a resolving power of 
15,000 FWHM without a fixed mass range. The AGC target value was 
set to 2 x 10° with a maximum injection time of 150 ms. Data-dependent 
parameters were set to acquire the top 10 ions with a dynamic exclusion 
of 30 s and a mass tolerance of 5 ppm. Isotope exclusion was turned 
on and the normalized collision energy was set to a constant value of 
30. Settings remained the same in both polarities. Polarity-switching 
HRMS full scan data were acquired with a resolving power of 60,000 
FWHM and a mass range of 80-1,200 Da; the AGC target was set to 10° 
with a maximum injection time of 100 ms. NAD⁺/NADH ratios were 
determined by integrating the extracted ion chromatograms for 
NAD⁺ in positive mode (m/z = 664.1164) and NADH in negative mode 
(m/z = 664.1175). Fragmentation spectra from pooled samples were 
used for structural confirmation of NAD⁺ and NADH. 


NADPH/NADP⁺ measurement 

Subcutaneous tumours were surgically excised as quickly as possible 
after killing the mice, then melanoma cells were mechanically 
dissociated and NADPH and NADP⁺ were measured using the NADPH/ 
NADP Glo-Assay (Promega) following the manufacturer’s instructions. 
Standard curves were generated using purified NADP⁺ (N-5755, Sigma- 
Aldrich) and NADPH (N-6705, Sigma-Aldrich) prepared in the same 
buffers used for the experimental samples. The absolute amounts of NADP⁺ 
and NADPH in each sample were then determined using these standard 
curves. Luminescence was measured using a FLUOstar Omega 
plate reader (BMG Labtech). Values were normalized to tissue mass. 


Assays for ROS levels and intracellular pH 

Subcutaneous tumours were surgically excised as quickly as possible 
after euthanizing the mice, and then melanoma cells were mechanically 
dissociated in 700 µl of staining medium. Single-cell suspensions 
were obtained by passing the dissociated cell suspensions through a 
40-µm cell strainer. To analyse ROS levels, equal numbers of dissociated 
cells from each treatment were stained for 30 min at 37 °C with 5 mM 
CellROX Green or CellROX DeepRed (Life Technologies) in HBSS-free 
(Ca²⁺- and Mg²⁺-free) and DAPI (to distinguish live from dead cells). The 
cells were then washed and analysed by flow cytometry using either a 
FACS Fusion or a FACS Fortessa (BD Biosciences) to assess ROS levels 
in live human melanoma cells (positive for human HLA and dsRed and 
negative for DAPI and mouse CD45/CD31/Ter119). 

To assess intracellular pH, equal numbers of dissociated cells from 
each treatment were stained with a pH-dependent ratiometric dye, 
Seminaphthorhodafluor-1 (Acetoxymethyl Ester) (SNARF1) in HBSS- 
free, and DAPI. We generated standard curves by incubating dissociated 
melanoma cells with pH 5.5, pH 6.5 or pH 7.5 buffers in the presence of 10 
mM valinomycin and nigericin (ionophores that allowed the cytoplasm 
to equilibrate with extracellular pH; Intracellular pH Calibration Buffer 
Kit, Life Technologies). SNARF1 fluorescence was measured by flow 
cytometry as described above and then converted to pH values using 
the standard curves. 
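
One simple way to convert a ratiometric dye reading to pH from a three-point standard curve is linear interpolation between neighbouring standards. The Python sketch below illustrates that idea only; the interpolation scheme, the function name and the calibration ratios are all assumptions for the example, since the text does not state how the curves were fitted.

```python
def ratio_to_ph(ratio, standards):
    """Interpolate pH from a fluorescence ratio.

    standards: mapping of calibration pH -> measured fluorescence ratio.
    Assumes the ratio changes monotonically with pH across the standards.
    """
    points = sorted((r, ph) for ph, r in standards.items())
    for (r0, ph0), (r1, ph1) in zip(points, points[1:]):
        if r0 <= ratio <= r1:
            return ph0 + (ph1 - ph0) * (ratio - r0) / (r1 - r0)
    raise ValueError("ratio outside the calibrated range")


# Hypothetical calibration ratios for the pH 5.5, 6.5 and 7.5 buffers.
calibration = {5.5: 0.5, 6.5: 1.0, 7.5: 2.0}
print(ratio_to_ph(0.75, calibration))
print(ratio_to_ph(1.5, calibration))
```

Readings outside the calibrated range raise an error rather than being extrapolated.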


Western blot analysis 
We used HCC15 cell lines as positive and negative controls for MCT1 
and MCT4 expression (previously described”). The identity of the 
HCC15 cells was confirmed using DNA fingerprinting and they were 
confirmed to be mycoplasma free using the e-Myco kit (Bulldog bio). 
MCF7 cell lines were used as a positive control for MCT2. MCF7 cell 
lines were obtained from, and authenticated by, ATCC and confirmed 
to be mycoplasma free using the e-Myco kit (Bulldog bio). Melanomas 
were excised and quickly snap-frozen in liquid nitrogen. Tumour lysates 
were prepared in Kontes tubes with disposable pestles using RIPA Buffer 
(Cell Signaling Technology) supplemented with phenylmethylsulpho- 
nyl fluoride (Sigma), and protease and phosphatase inhibitor cocktail 
(Roche). The bicinchoninic acid protein assay (Thermo) was used to 
quantify protein concentrations. Equal amounts of protein (10-20 
µg) were loaded into each lane and separated on 4-20% 
polyacrylamide tris glycine SDS gels (BioRad), then transferred to polyvinylidene 
difluoride membranes (BioRad). The membranes were blocked for 1 
h at room temperature with 5% milk in TBS supplemented with 0.1% 
Tween-20 (TBST) and then incubated with primary antibodies overnight 
at 4 °C. After washing, then incubating with horseradish 
peroxidase-conjugated secondary antibody (Cell Signaling Technology), signals 
were developed using SuperSignal West (Thermo Fisher). Blots were 
sometimes stripped using Restore stripping buffer (Thermo Fisher) 
and re-stained with other primary antibodies. The following antibodies 
were used for western blots: anti-MCT1 (AB3538P, Millipore), anti-MCT2 
(LN2021159, LabNed), anti-MCT4 (AB3316P, Millipore), anti-CD147 
(ab64616, Abcam), anti-LDHA (C4B5, Cell Signaling Technologies), 
anti-LDHB (ab53292, Abcam), anti-IKKα (D3W6N, Cell Signaling 
Technology), anti-IKKβ (D30C6, Cell Signaling Technology), anti-vimentin 
(D21H3, Cell Signaling Technology), anti-tubulin (ab52866, Abcam) 
and anti-β-actin (D6A8, Cell Signaling Technologies). 


Immunofluorescence staining of frozen tissue sections 

Tissues were fixed in 4% paraformaldehyde overnight at 4 °C, washed 
in PBS and cryoprotected in 30% sucrose overnight. Tissues were then 
frozen in OCT (Fisher). Sections (10 µm) were cut using a cryostat, 
washed three times in PBS for 5 min each, and blocked in 5% donkey 
serum (JacksonImmuno) in PBS for 1 h at room temperature. 
Sections were then stained with primary antibodies overnight: anti-MCT1 
(HPA003324, Sigma, 1:500) and anti-S100 (Z0311, Dako, 1:500). The 
next day, sections were washed three times in PBS for 5 min each and 
stained with secondary antibodies: Alexa Fluor488-AffiniPure F(ab’)2 
Fragment Donkey anti-Rabbit IgG, Cy3-AffiniPure F(ab’)2 Fragment 
Donkey anti-Rat IgG (JacksonImmuno) at 1:250 for 1 h in the dark at 
room temperature. Sections were washed three times in PBS for 5 min 
each then stained with DAPI (1:1,000) and mounted with Fluoromount-G 
(SouthernBiotech) for confocal imaging. 


Statistical methods 

Generally, several melanomas from different patients were tested in 
multiple independent experiments performed on different days. Mice 
were allocated to experiments randomly and samples processed in 
an arbitrary order, but formal randomization techniques were not 
used. Before analysing the statistical significance of differences among 
treatments, we tested whether data were normally distributed and 
whether variance was similar among treatments. To test for normality, 
we performed the Shapiro-Wilk tests when 3 < n < 20 or D'Agostino 
omnibus tests when n ≥ 20. To test whether variability significantly 
differed among treatments we performed F-tests (for experiments 
with two treatments) or Levene’s median tests (for experiments with 
more than two treatments). When the data significantly deviated from 
normality (P < 0.01) or variability significantly differed among 
treatments (P < 0.05), we log2-transformed the data and tested again for 
normality and variability. If the transformed data no longer signifi- 
cantly deviated from normality and equal variability, we performed 
parametric tests on the transformed data. If log2-transformation was 
not possible or the transformed data still significantly deviated from 
normality or equal variability, we performed non-parametric tests on 
the non-transformed data. 
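
The decision procedure in the paragraph above can be summarized as a small function. The Python sketch below is one reading of that procedure, not the authors' code: it takes the P values from the normality and equal-variance tests as inputs (on the raw data and, where a log2 transformation is possible, on the transformed data) and returns which kind of test to run and on which data. The function and argument names are hypothetical.

```python
def choose_analysis(p_norm_raw, p_var_raw, p_norm_log=None, p_var_log=None,
                    alpha_norm=0.01, alpha_var=0.05):
    """Pick the test family and data scale from assumption-check P values.

    p_norm_log and p_var_log are None when a log2 transformation is not
    possible (for example, when the data contain non-positive values).
    """
    def assumptions_hold(p_norm, p_var):
        return p_norm >= alpha_norm and p_var >= alpha_var

    if assumptions_hold(p_norm_raw, p_var_raw):
        return ("parametric", "raw")
    if p_norm_log is not None and p_var_log is not None \
            and assumptions_hold(p_norm_log, p_var_log):
        return ("parametric", "log2")
    return ("non-parametric", "raw")


print(choose_analysis(0.40, 0.30))              # assumptions met on raw data
print(choose_analysis(0.001, 0.30, 0.20, 0.25)) # rescued by log2 transform
print(choose_analysis(0.001, 0.30, 0.002, 0.25))
```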

All of the statistical tests we used were two-sided, where applicable. 
To assess the statistical significance of a difference between two 
treatments, we used Student’s t-tests or paired t-tests (when a parametric test 
was appropriate), Welch’s t-tests (when data were normally distributed 
but not equally variable) or Mann-Whitney or Wilcoxon tests (when 
a non-parametric test was appropriate). When it was possible to 
perform a one-sided or a two-sided statistical test we always performed 
two-sided tests. Multiple t-tests (parametric or non-parametric) were 
followed by Holm-Sidak’s multiple comparisons adjustment. To assess 
the statistical significance of differences between more than two 
treatments, we used one-way or two-way ANOVAs (when a parametric test 
was appropriate) followed by Holm-Sidak’s multiple comparisons 
adjustment or Kruskal-Wallis tests (when a non-parametric test was 
appropriate) followed by Dunn’s multiple comparisons adjustment. To 
assess the statistical significance of differences between time-course 
data, we used repeated-measures two-way ANOVAs (when a parametric 
test was appropriate and there were no missing data points) or mixed- 
effects analyses (when a parametric test was appropriate and there 
were missing data points) followed by Dunnett’s multiple comparisons 
adjustment, or nparLD, a statistical tool for the analysis of 
non-parametric longitudinal data, followed by the Benjamini-Hochberg 
method for multiple comparisons adjustment. To assess the statistical 
significance of overall differences between percentages of tumours 
formed by different treatments and cell doses in all melanomas, we 
used multiple linear regressions. To assess the statistical significance 
of differences in overall survival of TCGA SKCM patients, we used Man- 
tel-Cox’s log-rank tests. All statistical analyses were performed with 
Graphpad Prism 8.1 or R 3.5.1 with the stats, fBasics, car and nparLD 
packages. All data are mean ± s.d. 
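
For readers unfamiliar with the Holm-Sidak adjustment mentioned above, the textbook step-down form can be sketched as follows. This is a generic Python illustration with hypothetical P values; it is not the code used in this study, and the implementation in GraphPad Prism may differ in detail.

```python
def holm_sidak(p_values):
    """Return Holm-Sidak step-down adjusted P values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # Sidak correction for the (m - rank) hypotheses still in play.
        candidate = 1.0 - (1.0 - p_values[idx]) ** (m - rank)
        # Enforce monotonicity: an adjusted P never drops below an earlier one.
        running_max = max(running_max, candidate)
        adjusted[idx] = min(1.0, running_max)
    return adjusted


# Hypothetical raw P values from three comparisons.
print([round(p, 4) for p in holm_sidak([0.01, 0.04, 0.03])])
```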

Sample sizes were not predetermined based on statistical power 
calculations but were based on our experience with these assays. For 
assays in which variability is commonly high, we typically used n > 10. 
For assays in which variability is commonly low, we typically used 
n < 10. No data were excluded; however, mice sometimes died during 
experiments, presumably owing to the growth of metastatic tumours. 
In those instances, data that had already been collected on the mice in 
interim analyses were included (such as subcutaneous tumour growth 
measurements over time) even if it was not possible to perform the 
end-point analysis of metastatic disease burden (due to the premature 
death of the mice). 

During all isotope tracing experiments, the data were analysed in 
a manner blinded to sample identity or treatment. A.T. performed all 
of the infusions, collected tumour specimens and performed mass 
spectrometry, then passed the de-identified data files to B.F. and A.S.,
who analysed the isotope tracing patterns. After the patterns had been 
analysed for individual mice, the samples were re-identified so the 
results could be interpreted. 
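
The blinding workflow described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the sample names, the code format (`S001`, `S002`, ...) and the function names are hypothetical, and the source does not say how the data files were actually de-identified.

```python
import random


def deidentify(sample_ids, seed=None):
    """Assign each sample a shuffled anonymous code; return the key (code -> true ID)."""
    rng = random.Random(seed)
    codes = [f"S{i:03d}" for i in range(1, len(sample_ids) + 1)]
    rng.shuffle(codes)
    # The key stays with the person who ran the experiment, not the analysts.
    return dict(zip(codes, sample_ids))


def reidentify(results_by_code, key):
    """Map blinded analysis results back to the true sample IDs."""
    return {key[code]: value for code, value in results_by_code.items()}


# Hypothetical sample names and placeholder results.
key = deidentify(["M405_DMSO_1", "M405_AZD_1", "M481_DMSO_1"], seed=0)
blinded_results = {code: 0.0 for code in key}  # analysts see only the codes
print(reidentify(blinded_results, key))
```

Keeping the key separate from the analysed files is what keeps the analysts blinded until the per-mouse patterns have been scored.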


Reporting summary 
Further information on research design is available in the Nature 
Research Reporting Summary linked to this paper. 


Data availability 


Source Data for Figs. 1-5 and Extended Data Figs. 1-10 are provided 
with the paper. All other data are available from the corresponding 
authors upon request. 


36. Ran, F. A. et al. Genome engineering using the CRISPR-Cas9 system. Nat. Protocols 8,
2281-2308 (2013). 

37. Shi, X. et al. The abundance of metabolites related to protein methylation correlates with 
the metastatic capacity of human melanoma xenografts. Sci. Adv. 3, eaao5268 (2017). 

38. Marin-Valencia, I. et al. Analysis of tumor metabolism reveals mitochondrial glucose
oxidation in genetically diverse human glioblastomas in the mouse brain in vivo. Cell 
Metab. 15, 827-837 (2012). 

39. Yang, C. et al. Glutamine oxidation maintains the TCA cycle and cell survival during 
impaired mitochondrial pyruvate transport. Mol. Cell 56, 414-424 (2014). 

40. Tu, B. P. et al. Cyclic changes in metabolic state during the life of a yeast cell. Proc. Natl
Acad. Sci. USA 104, 16886-16891 (2007). 

41. Su, X., Lu, W. & Rabinowitz, J. D. Metabolite spectral accuracy on Orbitraps. Anal. Chem. 
89, 5940-5948 (2017). 

42. Matsuyama, S., Llopis, J., Deveraux, Q. L., Tsien, R. Y. & Reed, J. C. Changes in 
intramitochondrial and cytosolic pH: early events that modulate caspase activation 
during apoptosis. Nat. Cell Biol. 2, 318-325 (2000). 

43. Noguchi, K., Gel, Y., Brunner, E. & Konietschke, F. nparLD: an R software package for the 
nonparametric analysis of longitudinal data in factorial experiments. J. Stat. Softw. 50, 
1-23 (2012). 


Acknowledgements S.J.M. is a Howard Hughes Medical Institute (HHMI) Investigator, the Mary
McDermott Cook Chair in Pediatric Genetics, the Kathryn and Gene Bishop Distinguished Chair
in Pediatric Research, the director of the Hamon Laboratory for Stem Cells and Cancer, and a
Cancer Prevention and Research Institute of Texas Scholar. R.J.D. is an HHMI Investigator, the
Robert L. Moody, Sr. Faculty Scholar at UT Southwestern and Joel B. Steinberg, M.D. Chair in
Pediatrics. The research was supported by the Cancer Prevention and Research Institute of
Texas (RP170114 and RP180778), the National Institutes of Health (R35 CA220449; U01
CA228608) and the Robert A. Welch Foundation (I-1733). A.T. was supported by the Else
Kröner-Forschungskolleg and the Leopoldina Fellowship Program (LPDS 2016-16) of the German
National Academy of Sciences. B.F. was supported by a postdoctoral fellowship from the
Canadian Institutes of Health Research (MFE 140911). B.S. and A.S. were supported by Ruth L.
Kirschstein National Research Service Award Postdoctoral Fellowships from the National
Heart, Lung, and Blood Institute (F32 HL139016-01) and the National Institute of Child Health
and Human Development (F32 HD096786-01). We thank A. Gross for mouse colony
management as well as N. Loof and the Moody Foundation Flow Cytometry Facility.


Author contributions A.T., R.J.D. and S.J.M. conceived the project, and designed and
interpreted experiments. A.T. performed most of the experiments. B.F., A.S., W.G. and R.J.D.
participated in the design, analysis and interpretation of isotope tracing and metabolomics
experiments. B.F., T.P.M. and R.J.D. developed methods for metabolomics and isotope tracing
in vivo. V.R. and J.M.U. helped A.T. to perform the in vivo tumorigenesis assays. J.M.U.
performed the melanoma cell migration experiments in culture. B.S., Z.G. and S.Y.K. helped to
design the CRISPR gene-targeting and MCT1 overexpression constructs, and generated the
constructs. D.S. and T.V. helped with the assessment and interpretation of MCT1 expression
patterns in patient specimens. T.P.M. and M.M. performed all of the mass spectrometric
analysis of metabolomic and isotope tracing specimens. M.M.M. performed the
immunofluorescence analysis of MCT1 expression. Z.Z. performed statistical analyses. A.T.,
B.F., R.J.D. and S.J.M. wrote the manuscript.


Competing interests R.J.D. is an advisor for Agios Pharmaceuticals. S.J.M. is an advisor for 
Frequency Therapeutics and Protein Fluidics.


Additional information 

Supplementary information is available for this paper at
https://doi.org/10.1038/s41586-019-1847-2.

Correspondence and requests for materials should be addressed to R.J.D. or S.J.M. 

Peer review information Nature thanks John Cleveland, Markus Ralser and the other, 
anonymous, reviewer(s) for their contribution to the peer review of this work. 

Reprints and permissions information is available at http://www.nature.com/reprints. 


[Extended Data Fig. 1 image. Panel a: summary table of the melanomas and their metastatic potential in mice. Panels b-d, f, g: fractional enrichment in blood or tumours versus time (min) for efficient and inefficient metastasizers after infusion of [U-¹³C]glutamine, [U-¹³C]glucose, [U-¹³C]lactate or [2-²H]lactate. Panel e: plasma concentrations of glucose and lactate before and after infusion. Panel h: labelling schematic. Panel i: western blot of LDHA and LDHB.]


Extended Data Fig. 1 | Plasma enrichment of isotopically labelled
metabolites after infusion into xenografted mice. Related to Fig. 1. a,
Summary of the melanomas used in this study and their spontaneous
metastatic behaviour after subcutaneous transplantation into NSG mice.
Melanomas were characterized as inefficient or efficient metastasizers. Before
subcutaneous tumours grew to 2.5 cm in diameter (when the mice were killed
per approved protocol), inefficient metastasizers rarely formed
macrometastases or micrometastases beyond the lung, whereas efficient
metastasizers commonly formed macrometastases as well as micrometastases
in several organs (data reflect results from one to five independent
experiments per melanoma). Some of these data have been published
previously. b-g, Isotope tracing was performed in NSG mice subcutaneously
xenografted with efficiently metastasizing melanomas from four patients
(M405, M481, M487 and UT10) and inefficiently metastasizing melanomas
from nine patients (M715, UM17, UM22, UM43, UM47, M498, M528, M597 and
M610). The number of tumours or mice analysed per treatment is indicated.
b, Glutamine m+5 as a fraction of total plasma glutamine in mice infused
with [U-¹³C]glutamine (14 independent experiments). c, Isotope enrichment in
subcutaneous tumours after [U-¹³C]glutamine infusion (14 independent
experiments). d, Glucose m+6 as a fraction of total plasma glucose in mice
infused with [U-¹³C]glucose (20 independent experiments). e, Plasma glucose
and lactate concentrations before and after infusion. f, Lactate m+3 as a
fraction of total plasma lactate in mice infused with [U-¹³C]lactate (23
independent experiments). g, Lactate m+1 as a fraction of total plasma lactate
in mice infused with [2-²H]lactate (three independent experiments).
h, Expected isotope labelling after [2-²H]lactate infusion. i, Western blot
analysis of LDHA and LDHB in subcutaneous tumours from NSG mice
xenografted with efficiently (M405, M481 and UT10) or inefficiently (UM17,
UM43 and UM47) metastasizing melanomas (representative of four
independent experiments). Data are mean ± s.d. Statistical significance was
assessed using Mann-Whitney tests (c) and t-tests at 180 or 300 min when
tumours were obtained for analysis (b, d, f, g) or paired t-tests (e).
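
The enrichment values in panels b, d, f and g are each one isotopologue expressed as a fraction of the total pool (for example, glutamine m+5 over all glutamine isotopologues). A minimal Python sketch of that arithmetic is given below. The intensities are made-up numbers, the function name is hypothetical, and the sketch does not include natural-abundance correction or any other processing the authors may have applied.

```python
def fractional_enrichment(intensities, label):
    """Return one isotopologue's share of the total metabolite pool.

    intensities: dict mapping mass shift (0 for m+0, 1 for m+1, ...) to signal intensity.
    label: the mass shift of interest, e.g. 5 for glutamine m+5.
    """
    total = sum(intensities.values())
    if total <= 0:
        raise ValueError("total intensity must be positive")
    # A mass shift that was not detected contributes zero signal.
    return intensities.get(label, 0.0) / total


# Hypothetical plasma glutamine isotopologue intensities (arbitrary units).
glutamine = {0: 6000.0, 1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0, 5: 4000.0}
print(fractional_enrichment(glutamine, 5))
```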
Extended Data Fig. 2 | Efficient metastasizers express higher levels of MCT1
than inefficient metastasizers. Related to Fig. 2. a, Quantification of MCT1
relative to actin bands from the western blot in Fig. 2a comparing efficient and
inefficient metastasizers. b, Quantification of MCT4 relative to actin bands
from the western blot in Fig. 2c comparing efficient and inefficient
metastasizers. c, d, Quantification of mean fluorescence intensities for MCT1
staining in the flow cytometry plots comparing efficient (Fig. 2e) and
inefficient (Fig. 2d) metastasizers. HCC15 cells and MCT1-deficient HCC15 cells
were positive and negative controls (c). e, f, Immunofluorescence staining for
MCT1 (green) in sections from subcutaneous tumours from inefficiently
(e, UM47) or efficiently (f, M405) metastasizing melanomas. An adjacent
section was stained with an antibody against the melanoma marker S100b
(green). Images are representative of three independent experiments per
melanoma. g, h, Immunofluorescence staining for MCT1 (green) in sections
from subcutaneous tumours from inefficient (g; M498, M610 and M597) and
efficient (h; M481, UT10 and M405) metastasizers. In each case, an adjacent
section was stained with an antibody against S100b (green). Images are
representative of results from two independent experiments per melanoma.
i, j, Although efficient metastasizers often exhibited cell-surface staining (j),
inefficient metastasizers typically exhibited diffuse cytoplasmic staining (i).
Images are representative of results from two independent experiments per
melanoma. Data are mean ± s.d. Statistical significance was assessed using
Student’s t-tests (a, b, d).
Extended Data Fig. 3 | MCT1 inhibition impairs metastasis without altering
MCT1, CD147, CD98 or β1-integrin expression levels. Related to Fig. 2. a-c,
Western blot analysis of MCT1 (a), MCT4 (b) and CD147 (c) in subcutaneous
tumours versus metastatic liver, kidney and pancreas nodules from NSG mice
transplanted with three melanomas. d-g, Flow cytometry histograms of
anti-MCT1 (d, e) or anti-CD147 (f, g) staining in melanoma cells from subcutaneous
tumours or metastatic nodules from mice transplanted with M405 (d, f) or
M481 (e, g) melanomas. h-o, Flow cytometry histograms and mean
fluorescence intensities of anti-MCT1 (h, i), anti-CD147 (j, k), anti-CD98 (l, m) or
anti-β1-integrin (n, o) staining in melanoma cells from subcutaneous tumours
treated with DMSO (control; black) or AZD3965 (MCT1 inhibitor; blue). The
number of tumours or mice analysed in each treatment is indicated (two to
three experiments). In all flow cytometric analyses, human melanoma cells
were distinguished from mouse cells based on positivity for HLA-ABC and
DsRed and negativity for mouse CD31, CD45 and Ter119 staining (see Extended
Data Fig. 9e, f for gating strategy). p-u, Western blot analysis of IKKα (p-r) and
IKKβ (s-u) in subcutaneous tumours from NSG mice treated with DMSO or
AZD3965. Data are mean ± s.d. Statistical significance was assessed with
two-way ANOVA (i, k, m, o).


[Extended Data Fig. 4 image. Panels a, b: flow cytometry histograms of E-cadherin-FITC and N-cadherin-PE/Cy7 staining in M481, M405 and UT10 melanomas treated with DMSO or AZD3965, with human keratinocytes as controls. Panel c: western blots of vimentin and actin in the same melanomas.]


Extended Data Fig. 4 | MCT1 inhibition with AZD3965 impairs metastasis
without altering markers of epithelial-to-mesenchymal transition. Related
to Fig. 2. a, b, Flow cytometry histograms of anti-E-cadherin (a) and
anti-N-cadherin (b) staining in melanoma cells from subcutaneous tumours of mice
treated with DMSO (control) or AZD3965. Human keratinocytes were included
as a control in each case as they are known to include subpopulations of
E-cadherin- and N-cadherin-positive cells. In xenografts, human melanoma
cells were distinguished from mouse cells based on positivity for HLA-ABC and
DsRed and negativity for mouse CD31, CD45 and Ter119 staining (see Extended
Data Fig. 9e, f for gating strategy). Data are representative of two to three mice
analysed in two independent experiments. c, Western blot analysis of vimentin
in subcutaneous tumours from NSG mice treated with DMSO or AZD3965.
Extended Data Fig. 5 | Representative images of the bioluminescence
analysis of visceral organs to determine metastatic disease burden at end
point. Related to Figs. 2, 3 and 5. a-e, Visceral organs were surgically removed
from each mouse at end point and imaged to identify macrometastases and
micrometastases and to determine bioluminescence signal intensity. Each
melanoma was tagged with constitutive luciferase expression.
Extended Data Fig. 6 | shRNA-mediated knockdown of MCT1 inhibits
melanoma metastasis in vivo. Related to Fig. 2. a, b, Western blot analysis of
MCT1 (a) or MCT4 (b) in subcutaneous tumours from mice xenografted with
efficiently metastasizing melanomas transfected with scrambled control
shRNA or with two different shRNAs (1 and 2) against MCT1 (a) or MCT4 (b).
Wild-type HCC15 cells were used as a positive control for MCT1 (WT, a) and
MCT4 (WT, b) and MCT1-deficient (KO, a) or MCT4-deficient (KO, b) HCC15 cells
were used as a negative control (representative of two independent
experiments). c-e, Growth of subcutaneous tumours (c) in mice transplanted
with melanomas transfected with scrambled control shRNA or shRNAs (sh1 and
sh2) against MCT1. The number of mice analysed in each treatment is indicated
(one experiment per melanoma). The frequency of circulating melanoma
cells in the blood (d) and metastatic disease burden based on bioluminescence
imaging (e) in the same mice were determined. f, Western blot analysis of MCT1
in subcutaneous tumours transfected with scrambled control shRNA or
shRNAs against MCT1, with (MCT1-OE) or without an shRNA-insensitive MCT1
cDNA. g, h, Growth of subcutaneous tumours (g) and metastatic disease
burden at end point (h) in mice transplanted with melanomas transfected with
scrambled control shRNA or shRNAs against MCT1 and an shRNA-insensitive
MCT1 cDNA. i, Fold change in mean fluorescence intensity for CellRox DeepRed
staining (ROS) in xenografted melanoma cells with scrambled control shRNA
or shRNAs against MCT1 treated with AZD3965 or DMSO. Data are mean ± s.d.
Statistical significance was assessed using nparLD followed by
Benjamini-Hochberg’s multiple comparisons adjustment (c), log₂-transformed one-way
ANOVAs with Holm-Sidak’s multiple comparisons adjustment (d, e, h),
mixed-effects analysis followed by Dunnett’s multiple comparisons adjustment (g) or
log₂-transformed two-way ANOVA with Sidak’s multiple comparisons
adjustment (i).
Extended Data Fig. 7 | CRISPR deletion of MCT1 from mouse melanoma cells
impairs metastasis, whereas MCT1 overexpression in patient-derived
xenografts increases metastasis. Related to Fig. 2. a, Western blot analysis of
MCT1 in wild-type parental YUMM1.7 melanoma cells as well as two lines from
which MCT1 had been deleted using CRISPR (KO #1 and #2). b-d, Growth of
subcutaneous tumours (b), total metastatic disease burden at end point by
bioluminescence imaging of visceral organs (c) and CellRox DeepRed staining
of subcutaneous tumour cells (d). The number of mice analysed in each
treatment is indicated (one experiment; note that one mouse died in the KO #2
treatment before end-point analysis). e, Western blot analysis of MCT1 in an
inefficiently metastasizing melanoma (UM47) expressing MCT1 cDNA.
f, g, Growth of subcutaneous tumours (f) and total metastatic disease burden
at end point by bioluminescence imaging of visceral organs (g) from mice
transplanted with these melanomas (one experiment; note that two mice died
in the control treatment before end-point analysis). Data are mean ± s.d.
Statistical significance was assessed using one-way ANOVA followed by
Dunnett’s multiple comparisons adjustment (b, day 25) or log₂-transformed
one-way ANOVAs followed by Dunnett’s multiple comparisons adjustment
(c, d), t-test (f, day 90) or log₂-transformed t-test (g).
Extended Data Fig. 8 | MCT1 inhibition does not impair the migration of
melanoma cells in culture but appears to reduce metastatic disease burden
by killing metastasizing melanoma cells in vivo. Related to Fig. 2. a, Migration
in transwell invasion assays of three melanomas treated with DMSO (control) or
AZD3965 (MCT1 inhibitor), including representative images (left) and counts
(right) of the cells that migrated across the insert after 24 h (one experiment
with two to three replicate cultures per melanoma). b, c, Effect of acute
treatment with AZD3965 (7 days) on the diameter of subcutaneous tumours,
the frequency of circulating melanoma cells in the blood and metastatic
disease burden in mice with established M481 (b) or M405 (c) melanomas.
Treatment was initiated when the subcutaneous tumours reached 2 cm in
diameter (one experiment per melanoma with three mice per treatment).
d, Efficiently metastasizing melanoma cells (M405) were subcutaneously
transplanted into mice and allowed to spontaneously metastasize; then the
primary tumours were resected to prolong survival and to allow the metastatic
tumours that had formed before primary tumour resection to grow larger. Mice
were treated with AZD3965 for the duration of the experiment, only before
primary tumour resection, or only after primary tumour resection. e, Analysis
of total metastatic disease burden at end point showing that metastatic disease
burden was reduced when AZD3965 treatment was performed before primary
tumour resection, during the time when melanoma cells were spontaneously
metastasizing, but before metastatic tumours were established. The number
of mice per treatment is shown (two independent experiments). Data are
mean ± s.d. Statistical significance was assessed using two-way ANOVA
followed by Dunnett’s multiple comparisons adjustment (a), t-tests (b, c) or
Kruskal-Wallis test followed by Dunn’s multiple comparisons adjustment (e).
Extended Data Fig. 9 | Increased MCT1 expression in melanomas is
associated with significantly worse patient survival. Related to Fig. 2. a-d,
Kaplan-Meier overall survival curves of patients with melanoma stratified
based on expression levels of MCT1 (a), MCT2 (b), MCT4 (c) and CD147 (d) within
tumour specimens. Data are from the SKCM cohort in TCGA
(https://portal.gdc.cancer.gov/projects/TCGA-SKCM). Each panel compares the top third of
patients with the highest expression levels versus the bottom third of patients
with the lowest expression levels. Ticks represent censored values. e, f, Flow
cytometry plots showing the gating strategies used to identify human
melanoma cells in subcutaneous tumours (e) or the blood (f) of xenografted
mice. Cells were gated on forward versus side scatter (FSC-A versus SSC-A) to
exclude red blood cells and clumps of cells. Human melanoma cells were
selected by including cells that stained positively for DsRed (stably expressed
in all melanoma lines) and HLA and excluding cells that stained positively for
the mouse haematopoietic and endothelial markers CD45, CD31 or Ter119.
Statistical significance of the differences in overall survival (a-d) was assessed
using the Mantel-Cox log-rank test.
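
The stratification used in panels a-d (top third versus bottom third of patients by expression) can be sketched in Python as below. The patient identifiers and expression values are hypothetical, and the handling of ties and of cohort sizes not divisible by three is an assumption of this sketch, not something stated in the source.

```python
def split_top_bottom_thirds(expression_by_patient):
    """Return (bottom_third, top_third) lists of patient IDs ranked by expression."""
    # Rank patients from lowest to highest expression.
    ranked = sorted(expression_by_patient, key=expression_by_patient.get)
    # Assumption: each group holds floor(n / 3) patients; the middle group is dropped.
    k = len(ranked) // 3
    return ranked[:k], ranked[len(ranked) - k:]


# Hypothetical expression values for six patients.
expression = {"P1": 2.1, "P2": 9.4, "P3": 5.0, "P4": 0.7, "P5": 7.7, "P6": 4.2}
low, high = split_top_bottom_thirds(expression)
print("bottom third:", low)
print("top third:", high)
```

The two groups would then be passed to a survival comparison such as the log-rank test named in the caption.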
Extended Data Fig. 10 | MCT1 inhibition reduces the levels of PPP, but not
glycolytic, metabolites. Related to Figs. 3-5. a, The GSH to GSSG ratios in
melanoma cells from mice treated with AZD3965 or DMSO (two independent
experiments per melanoma). b, Quantitative analysis of NADPH and NADP⁺ in
melanoma cells from mice treated with AZD3965 or DMSO (one or two
experiments per melanoma). Liver cells were included as a control, with a high
NADPH/NADP⁺ ratio. c, Expected isotope-labelled species after infusion of
[1,2-¹³C]glucose. d, Glucose m+2 as a fraction of total plasma glucose in mice
xenografted with efficiently metastasizing melanomas (M405, M481 and
UT10), treated with DMSO or AZD3965 and infused with [1,2-¹³C]glucose.
e, Glucose m+6 as a fraction of total plasma glucose in mice infused with
[U-¹³C]glucose. The number of mice per treatment is indicated (two independent
experiments). f-i, LC-MS measurement of the levels of glycolytic (f, h) and
oxidative PPP (g, i) metabolites in subcutaneous tumour cells from mice
xenografted with melanomas treated with DMSO (control) or AZD3965 (MCT1
inhibitor) for 7 days. j, Flow cytometrically isolated MCT1^high or MCT1^−/low
melanoma cells were subcutaneously transplanted into NSG mice, using 10 or
100 cells per injection. All injections formed tumours. Rate of growth of the
tumours initiated with 10-cell injections. Data are mean ± s.d. Statistical
significance was assessed using t-tests (a), repeated-measures two-way
ANOVAs (b), t-test (e, 180 min), log₂-transformed two-way ANOVAs (f, h),
log₂-transformed t-tests (g, M405 and UT10), Mann-Whitney test (g, M481
and i, M481), Welch’s t-tests (i, M405 and UT10) or using nparLD test (d, j).
Statistical parameters


When statistical analyses are reported, confirm that the following items are present in the relevant location (e.g. figure legend, table legend, main 
text, or Methods section). 


n/a | Confirmed 


The exact sample size (n) for each experimental group/condition, given as a discrete number and unit of measurement 


An indication of whether measurements were taken from distinct samples or whether the same sample was measured repeatedly 


The statistical test(s) used AND whether they are one- or two-sided 
Only common tests should be described solely by name; describe more complex techniques in the Methods section. 


A description of all covariates tested 


A description of any assumptions or corrections, such as tests of normality and adjustment for multiple comparisons 


A full description of the statistics including central tendency (e.g. means) or other basic estimates (e.g. regression coefficient) AND 
variation (e.g. standard deviation) or associated estimates of uncertainty (e.g. confidence intervals) 
Give P values as exact values whenever suitable.


For Bayesian analysis, information on the choice of priors and Markov chain Monte Carlo settings 


For hierarchical and complex designs, identification of the appropriate level for tests and full reporting of outcomes 


Estimates of effect sizes (e.g. Cohen's d, Pearson's r), indicating how they were calculated 
State explicitly what error bars represent (e.g. SD, SE, CI)

Policy information about availability of computer code


Data collection Flow cytometry data were collected using BD FACSDiva 8.0. Bioluminescence data were collected using Living Image software V4.3.1.
GC-MS data were collected using Agilent ChemStation E02.02.1431. LC-MS/MS data were collected using SCIEX Analyst v1.6.3 and
Thermo Scientific XCalibur 4.1.50. Immunofluorescence data were collected using Zeiss ZEN 2.3 software.

Data analysis GraphPad Prism 8 and R 3.5.1 with the stats, fBasics, car and nparLD packages. Flow cytometry data analysis using BD FACSDiva 8.0 and
FlowJo V10 (Treestar). Bioluminescence data were analyzed using Living Image software V4.3.1. GC-MS data analysis using Agilent
ChemStation E02.02.1431. LC-MS/MS data analysis using SCIEX Multiquant v2.1.1, Thermo Scientific Compound Discoverer 3.0 and
Thermo Scientific Trace Finder 4.1. Immunofluorescence data were analyzed using Bitplane Imaris V9.2.1 software. Western blot
densities were quantified using ImageJ 1.52k.


For manuscripts utilizing custom algorithms or software that are central to the research but not yet described in published literature, software must be made available to editors/reviewers 
upon request. We strongly encourage code deposition in a community repository (e.g. GitHub). See the Nature Research guidelines for submitting code & software for further information. 
All manuscripts must include a data availability statement. This statement should provide the following information, where applicable:


- Accession codes, unique identifiers, or web links for publicly available datasets 
- A list of figures that have associated raw data 
Data supporting the findings of this study are available within the article and its Supplementary Information files or from the corresponding author on request.

For a reference copy of the document with all sections, see nature.com/authors/policies/ReportingSummary-flat.pdf


Life sciences study design 


All studies must disclose on these points even when the disclosure is negative. 


Sample size Sample sizes were not pre-determined based on statistical power calculations but were based on our experience with these assays. For
assays in which variability is commonly high, we typically used n>10. For assays in which variability is commonly low, we typically used n<10. 


Data exclusions No data were excluded; however, mice sometimes died during experiments, presumably due to the growth of metastatic tumors. In those 
instances, data that had already been collected on the mice in interim analyses were included (such as subcutaneous tumor growth 
measurements over time) even if it was not possible to perform the end-point analysis of metastatic disease burden (due to the premature 
death of the mice). 


Replication The experimental findings were reproduced in multiple independent experiments. The number of independent experiments and biological 
replicates for each data panel is indicated in the figure panel itself, in the figure legends, and in the source data files. Data shown in the figures 
represent the aggregate of all independent experiments in most cases. Data shown in a minority of panels are from a representative 
experiment (e.g. for western blots) and in those cases the number of independent experiments that reproduced the finding is also indicated in 
the figure legends. Tumor growth curves also show representative experiments because it is difficult to combine together tumor growth curve 
data from multiple different experiments. 


Randomization No formal randomization techniques were used; however, samples were allocated randomly to experiments and processed in an arbitrary
order. 


Blinding During all isotope tracing experiments, the data were analyzed in a manner blinded to sample identity or treatment. A.T. performed all of the 
infusions, collected tumor specimens, and performed mass spectrometry, then passed the de-identified data files to B.F. and A.S., who 
analyzed the isotope tracing patterns. After the patterns had been analyzed for individual mice, the samples were re-identified so the results 
could be interpreted. 


Reporting for specific materials, systems and methods 


Materials & experimental systems Methods 

n/a | Involved in the study n/a | Involved in the study 
Unique biological materials | ChIP-seq 
Antibodies Flow cytometry 
Eukaryotic cell lines MRI-based neuroimaging 


Palaeontology 


Animals and other organisms 


Human research participants 


Antibodies 


Antibodies used The following antibodies have been used in this study:

Anti-Mouse CD31 (PECAM-1), eFluor 450, clone:390, Cat. #48-0311-82, LOT:1982691, ebioscience, 1:100, Flow
Anti-Mouse TER-119, VioletFluor450, clone:TER-119, Cat. #75-5921-U100, LOT:C5921081018753, Tonbo, 1:100, Flow
Anti-Mouse CD45, VioletFluor450, clone:30-F11, Cat. #75-0451-U100, LOT:C0451033117753, Tonbo, 1:100, Flow
Mouse Anti-Human HLA-ABC, FITC, clone:G46-2.6, Cat. #555552, LOT:8183993, BD Pharmingen, 1:20, Flow
Anti-Human MCT1, Alexa Fluor 647, clone:bs-10249R, Cat. #bs-10249R-A647, LOT:AE112116, Bioss antibodies, 1:100, Flow
Anti-Human CD147, Alexa Fluor 488, clone:HIM6, Cat. #306207, LOT:B213982, BioLegend, 1:100, Flow
Anti-Human CD98, PE-Vio 770, clone:REA387, Cat. #130-105-710, LOT:5170207052, Miltenyi Biotec, 1:200, Flow
Anti-Human beta 1-Integrin, Alexa Fluor 700, clone:P5D2, Cat. #FAB17781N-100UG, LOT:1529055, R&D systems, 1:100, Flow
Anti-Human CD324 (E-Cadherin), FITC, clone:67A4, Cat. #324104, LOT:B203125, BioLegend, 1:100, Flow
Anti-Human CD325 (N-Cadherin), PE/Cy7, clone:8C11, Cat. #350811, LOT:B272631, BioLegend, 1:100, Flow
Anti-Rabbit IgG, isotype control, Alexa Fluor 647, Cat. #bs-0295P-A647, LOT:AGO726809, Bioss antibodies, 1:100, Flow
Anti-Mouse IgG1, k, isotype control, Alexa Fluor 488, clone:MOPC-21, Cat. #400129, LOT:B220820, BioLegend, 1:100, Flow
Anti-Human IgG1, isotype control, PE-Vio 770, clone:REA293, Cat. #130-113-452, LOT:5190329516, Miltenyi Biotec, 1:100, Flow
Anti-Mouse IgG1, k, isotype control, Alexa Fluor 700, clone:11711, Cat. #ICOO2N, LOT:ACIJ0418111, R&D systems, 1:100, Flow
Anti-MCT1, clone:HPA003324, Cat. #HPA003324-100UL, LOT:C75340, Sigma, 1:500, IF
Anti-Rabbit IgG, biotinylated, Cat. #PK-6104, Vector Laboratories, 1:250, IF
Anti-Rabbit IgG, peroxidase conjugated, Cat. #PK-6101, Vector Laboratories, 1:250, IF
Anti-S100, Cat. #Z0311, polyclonal, LOT:00060051, Dako, 1:500, IF
Donkey anti-Rabbit-IgG, Alexa Fluor 488 AffiniPure F(ab')2 fragment, Cat. #711-545-152, JacksonImmuno, 1:250, IF
Donkey anti-rat IgG, Cy3-AffiniPure F(ab')2 fragment, Cat. #712-166-150, JacksonImmuno, 1:250, IF
Anti-MCT1, Cat. #AB3538P, LOT:3190916, EMD Millipore, 1:5000, WB
Anti-MCT2, Cat. #LN2021159, LOT:5653586301013, LabNed, 1:5000, WB
Anti-MCT4, Cat. #AB3316P, LOT:2972442, EMD Millipore, 1:5000, WB
Anti-CD147, Cat. #ab64616, LOT:GR205454-1, Abcam, 1:10000, WB
Anti-Vimentin, clone:D21H3, Cat. #5741T, LOT:6, Cell Signaling, 1:10000, WB
Anti-IKK-alpha, clone:D3W6N, Cat. #61294S, LOT:1, Cell Signaling, 1:5000, WB
Anti-IKK-beta, clone:D30C6, Cat. #8943S, LOT:4, Cell Signaling, 1:5000, WB
Anti-LDHA, clone:C4B5, Cat. #3582, LOT:9, Cell Signaling, 1:10000, WB
Anti-LDHB, clone:EP1565Y, Cat. #ab53292, LOT:GR103088-10, Abcam, 1:10000, WB
Anti-alpha Tubulin, clone:EP1332Y, Cat. #ab52866, LOT:GR3241238-1, Abcam, 1:10000, WB
Anti-beta-Actin, clone:D6A8, Cat. #12620S, LOT:6, Cell Signaling, 1:10000, WB

Validation All antibodies are commercially available and have been validated in previously published studies (e.g. Nature 527:186).
Antibodies that were central to our conclusions, such as the anti-MCT1 antibodies, were validated with control lines (that were
positive or deficient for MCT1) and similar results were obtained using multiple independent antibodies.


Anti-Mouse CD31 (Cat. #48-0311-82, eBioscience). This monoclonal antibody recognizes mouse CD31.
https://www.thermofisher.com/antibody/product/CD31-PECAM-1-Antibody-clone-390-Monoclonal/48-0311-82


Anti-Mouse TER-119 (Cat. #75-5921-U100, Tonbo). This monoclonal antibody recognizes mouse TER-119.
https://www.tonbobio.com/violetfluortm-450-anti-mouse-ter-119-ter-119.html


Anti-Mouse CD45 (Cat. #75-0451-U100, Tonbo). This monoclonal antibody recognizes mouse CD45.
https://www.tonbobio.com/violetfluor-mouse-cd45-30-f11.html


Mouse Anti-Human HLA-ABC (Cat. #555552, BD Pharmingen). This monoclonal antibody binds to a monomorphic epitope on the
human alpha chain of HLA-A, HLA-B and HLA-C.
www.bdbiosciences.com/us/applications/research/stem-cell-research/mesenchymal-stem-cell-markers-adipose/human/positive-markers/fitc-mouse-anti-human-hla-abc-g46-26/p/555552


Anti-Human MCT1 (Cat. #bs-10249R-A647, Bioss antibodies). This polyclonal antibody recognizes extracellular epitopes of MCT1. 
Species reactivity - Human, Mouse and Rat. We tested this antibody on MCT1-deficient cells (Fig. 2f) and found it to be specific 
for MCT1. https://www.biossusa.com/products/bs-10249R-A647 


Anti-Human CD147 (Cat. #306207, BioLegend). This monoclonal antibody recognizes human CD147. Species reactivity - Human. 
https://www.biolegend.com/en-us/products/alexa-fluor-488-anti-human-cd147-antibody-3367 


Anti-Human CD98 (Cat. #130-105-710, Miltenyi Biotec). This monoclonal antibody recognizes the human CD98 antigen.
https://www.miltenyibiotec.com/US-en/products/macs-flow-cytometry/antibodies/primary-antibodies/cd98-antibodies-human-rea387-1-11.html?utm_source=3rd_labome%20&utm_medium=product_listing&utm_term=flow-cytometry&utm_campaign=reafinity


Anti-Human beta 1-Integrin (Cat. #FAB17781N-100UG, R&D systems). This monoclonal antibody recognizes human Integrin beta
1/CD29. Species reactivity - Human.
https://www.rndsystems.com/products/human-integrin-beta-1-cd29-alexa-fluor-700-conjugated-antibody-p5d2_fab17781n


Anti-Human CD324 (E-Cadherin) (Cat. #324104, BioLegend). This monoclonal antibody recognizes human CD324, also known as
E-cadherin. Species reactivity - human.
https://www.biolegend.com/en-us/products/fitc-anti-human-cd324-e-cadherin-antibody-3750


Anti-Human CD325 (N-Cadherin) (Cat. #350811, BioLegend). This monoclonal antibody recognizes human CD325, also known as
N-cadherin. Species reactivity - human.
https://www.biolegend.com/en-us/products/pe-cy7-anti-human-cd325-n-cadherin-antibody-9041


Rabbit IgG, isotype control (Cat. #bs-0295P-A647, Bioss antibodies). This polyclonal antibody is designed to serve as a control to 
account for non-specific staining by primary antibodies that is caused by Fc Receptor-mediated binding or other isotype-specific 
mechanisms. It was conjugated with Alexa Fluor 647. https://www.biossusa.com/products/bs-0295p-a647 


Mouse IgG1k, isotype control (Cat. #400129, BioLegend). This monoclonal antibody is designed to serve as a control to account
for non-specific staining by primary antibodies that is caused by Fc Receptor-mediated binding or other isotype-specific
mechanisms. It was conjugated with Alexa Fluor 488.
https://www.biolegend.com/en-us/products/alexa-fluor-488-mouse-igg1--kappa-isotype-ctrl-fc-2687


Human IgG1, isotype control (Cat. #130-113-452, Miltenyi Biotec). This monoclonal antibody is designed to serve as a control to
account for non-specific staining by primary antibodies that is caused by Fc Receptor-mediated binding or other isotype-specific
mechanisms. It was conjugated with PE-Vio 770.
https://www.miltenyibiotec.com/US-en/products/macs-flow-cytometry/antibodies/isotype-control-antibodies/rea-control-antibodies-rea293-1-50.html


Mouse IgG1k, isotype control (Cat. #IC002N, R&D systems). This monoclonal antibody is designed to serve as a control to
account for non-specific staining by primary antibodies that is caused by Fc Receptor-mediated binding or other isotype-specific
mechanisms. It was conjugated to Alexa Fluor 700.
https://www.rndsystems.com/products/mouse-igg-1-alexa-fluor-700-conjugated-antibody_ic002n


Anti-human MCT1 (Cat. #HPA003324-100UL, Sigma). This polyclonal antibody recognizes human MCT1. This antibody has been
used previously to identify MCT1 in human cells (Cell Stem Cell. 20:635). This antibody was validated by the Human Protein Atlas
(HPA) project. https://www.proteinatlas.org/ENSG00000155380-SLC16A1.
https://www.sigmaaldrich.com/catalog/product/sigma/hpa003324?lang=en&region=US


Anti-Rabbit IgG, biotinylated (Cat. #PK-6104, Vector Laboratories). 
The biotinylated anti-rabbit IgG recognizes specifically rabbit IgG primary antibodies. 
https://vectorlabs.com/vectastain-elite-abc-kit-rat-igg.html 


Anti-Rabbit IgG, peroxidase conjugated (Cat. #PK-6101, Vector Laboratories).
The peroxidase anti-rabbit IgG specifically recognizes rabbit IgG primary antibodies.
https://vectorlabs.com/vectastain-elite-abc-kit-rabbit-igg.html


Anti-S100 (Cat. #Z0311, Dako). This polyclonal antibody strongly reacts with human S100B, and weakly or very weakly with
S100A1 and S100A6, respectively. This antibody has been used previously to identify S100 expression in melanoma cells (Nature
527:186).
https://www.agilent.com/en/product/immunohistochemistry/antibodies-controls/primary-antibodies/s100-(dako-omnis)-76198


Donkey anti-Rabbit-IgG, Alexa Fluor 488 AffiniPure F(ab')2 fragment (Cat. #711-545-152, JacksonImmuno). This polyclonal
antibody binds to rabbit IgG heavy and light chains. It also reacts with the light chains of other rabbit immunoglobulins.
https://www.jacksonimmuno.com/catalog/products/711-545-152


Donkey anti-rat IgG, Cy3-AffiniPure F(ab')2 fragment (Cat. #712-166-150, JacksonImmuno). This polyclonal antibody binds to
rat IgG heavy and light chains. It also reacts with the light chains of other rat immunoglobulins.
https://www.jacksonimmuno.com/catalog/products/712-166-150


Anti-MCT1 (Cat. #AB3538P, EMD Millipore).
This antibody recognizes human MCT1. This polyclonal antibody has been used previously to identify MCT1 expression in cancer
cells by western blotting (Cancer Res. 77:5591). We tested the specificity of this antibody by staining MCT1-deficient cells in Fig.
2a and ED Fig. 5a by western blot.
http://www.emdmillipore.com/US/en/product/Anti-Monocarboxylate-Transporter-1-Antibody,MM_NF-AB3538P


Anti-MCT2 (Cat. #LN2021159, LabNed). This polyclonal antibody recognizes epitopes of human MCT2. This antibody was tested
with a known positive control (MCF7 protein, Nat Chem Biol. 14:1032, Fig. 1e) in Fig. 2b of this manuscript.
https://labned.com/slc16a7-human-unlb-antibody-ln2021159


Anti-MCT4 (Cat. #AB3316P, EMD Millipore). This polyclonal antibody recognizes human MCT4. This antibody has been used 
previously to identify MCT4 staining in human cancer cells by western blotting (Cancer Res. 77:5591). We independently tested 
the specificity of this antibody by staining positive and negative control cells in Fig. 2c and ED Fig. 5b of this manuscript by 
western blot. https://www.emdmillipore.com/US/en/product/Anti-Monocarboxylate-Transporter-4-Antibody,MM_NF-AB3316P


Anti-CD147 (Cat. #ab64616, Abcam). This polyclonal antibody recognizes human and mouse CD147. The manufacturer validated
the antibody in western blots using synthetic peptides. https://www.abcam.com/cd147-antibody-ab64616.html


Anti-Vimentin (Cat. #5741T, Cell Signaling). This monoclonal antibody recognizes human, mouse, rat, and monkey vimentin 
proteins. https://www.cellsignal.com/products/primary-antibodies/vimentin-d21h3-xp-rabbit-mab/5741 


Anti-IKK-alpha (Cat. #61294S, Cell Signaling). This monoclonal antibody recognizes human, mouse, and rat IKKα proteins.
https://www.cellsignal.com/products/primary-antibodies/ikka-d3w6n-rabbit-mab/61294?site-search 
type=Products&N=4294956287&Ntt=61294s&fromPage=plp&_requestid=3523507 


Anti-IKK-beta (Cat. #8943S, Cell Signaling).
This monoclonal antibody recognizes total IKKβ protein from human, mouse, rat, and monkey but does not cross-react with
other IKK family members. https://www.cellsignal.com/products/primary-antibodies/ikkb-d30c6-rabbit-mab/8943


Anti-LDHA (Cat. #3582, Cell Signaling). This monoclonal antibody recognizes endogenous levels of total LDHA protein. Species 
Reactivity - Human, Monkey. https://www.cellsignal.com/products/primary-antibodies/ldha-c4b5-rabbit-mab/3582


Anti-LDHB (Cat. #ab53292, Abcam). This monoclonal antibody recognizes total human LDHB protein and the specificity was 
validated by the manufacturer using LDHB-deficient HAP1 cells. 
https://www.abcam.com/lactate-dehydrogenase-bldh-b-antibody-ep1565y-ab53292.html?productWallTab=ShowAll


Anti-Alpha Tubulin (Cat. #ab52866, Abcam). This monoclonal antibody recognizes mouse, rat, human, pig, and drosophila alpha 
Tubulin. https://www.abcam.com/alpha-tubulin-antibody-ep1332y-microtubule-marker-ab52866.html 


Anti-beta-Actin (Cat. #12620S, Cell Signaling). This monoclonal antibody
recognizes total β-actin protein in human, mouse, rat, monkey, drosophila, and zebrafish cells.
https://www.cellsignal.com/products/antibody-conjugates/b-actin-d6a8-rabbit-mab-hrp-conjugate/12620?site-search 
type=Products&N=4294956287&Ntt=12620s&fromPage=plp&_requestid=3526589 


Eukaryotic cell lines 


Policy information about cell lines 


Cell line source(s) YUMM1.7, YUMM3.3 and YUMM5.2 cell lines were purchased from ATCC.
Authentication YUMM1.7, YUMM3.3 and YUMM5.2 cell lines were authenticated by ATCC.
Mycoplasma contamination All cell lines were confirmed to be mycoplasma free by MycoAlert detection kit (Lonza). 


Commonly misidentified lines No commonly misidentified cell lines were used. 
(See ICLAC register) 


Animals and other organisms 


Policy information about studies involving animals; ARRIVE guidelines recommended for reporting animal research 


Laboratory animals Four- to 8-week-old NOD.CB17-Prkdcscid Il2rgtm1Wjl/SzJ (NSG) and four- to 8-week-old C57/BL6 mice were used. Both male and
female mice were used.


Wild animals No wild animals were used. 


Field-collected samples 


No field-collected samples were used. 


Human research participants 


Policy information about studies involving human research participants 


Population characteristics
Our research did not involve human subjects, but relied upon patient melanoma specimens that had been collected and
described in prior studies cited in the paper. The melanoma specimens were obtained with approval by the Institutional Review
Board of the University of Michigan Medical School (IRBMED approvals HUM00050754 and HUM00050085) and the University
of Texas Southwestern Medical Center (IRB approval 102010-051). They were shared with us as de-identified specimens.

Recruitment
We did not recruit any patients.


Flow Cytometry


Plots

Confirm that:
The axis labels state the marker and fluorochrome used (e.g. CD4-FITC).
The axis scales are clearly visible. Include numbers along axes only for bottom left plot of group (a 'group' is an analysis of identical markers).
All plots are contour plots with outliers or pseudocolor plots.
A numerical value for number of cells or percentage (with statistics) is provided.


Methodology

Sample preparation
Tumors were dissociated in Kontes tubes with disposable pestles (VWR) followed by enzymatic digestion for 20 min with 200 U/ml
collagenase IV (Worthington), 5 mM CaCl2 and 50 U/ml DNase. To obtain a single-cell suspension, cells were filtered through a
40 µm cell strainer.

Instrument
BD FACS Aria Fusion (for cell sorting or analysis), BD Fortessa (for analysis).

Software
BD FACSDiva 8.0, FlowJo V10

Cell population abundance
The abundance of the relevant cell populations within post-sort fractions was 90-100% in experiments.

Gating strategy
Human melanoma cells were isolated as cells that were positive for human HLA and negative for mouse endothelial and
hematopoietic markers (mouse CD31, mouse CD45 and mouse TER119). To eliminate dead cells from sorts and analyses, cells
were stained with 4',6-diamidino-2-phenylindole (DAPI).

Tick this box to confirm that a figure exemplifying the gating strategy is provided in the Supplementary Information.
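
The gating strategy described above amounts to a Boolean selection on marker calls. The following is a minimal sketch of that logic only, assuming each event has already been reduced to positive/negative calls per marker; the `Event` structure, field names and example events are hypothetical and are not taken from the authors' FACSDiva or FlowJo workflow.

```python
from dataclasses import dataclass


@dataclass
class Event:
    """One flow-cytometry event with Boolean marker calls (hypothetical structure)."""
    human_hla: bool
    mouse_cd31: bool
    mouse_cd45: bool
    mouse_ter119: bool
    dapi: bool  # DAPI-positive events are treated as dead cells


def is_live_human_melanoma(e: Event) -> bool:
    """Keep events that are HLA+, negative for all mouse markers, and DAPI-."""
    mouse_lineage = e.mouse_cd31 or e.mouse_cd45 or e.mouse_ter119
    return e.human_hla and not mouse_lineage and not e.dapi


# Illustrative events only; these are not measurements from the study.
events = [
    Event(True, False, False, False, False),   # live human cell: kept
    Event(True, False, True, False, False),    # mouse CD45+: excluded
    Event(True, False, False, False, True),    # DAPI+ (dead): excluded
    Event(False, False, False, False, False),  # HLA-: excluded
]
selected = [e for e in events if is_live_human_melanoma(e)]
print(len(selected))
```

In practice the positive/negative calls come from gates drawn on fluorescence intensities, which this sketch does not model.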
Modifications of histone proteins have essential roles in normal development and
human disease. Recognition of modified histones by ‘reader’ proteins is a key
mechanism that mediates the function of histone modifications, but how the
dysregulation of these readers might contribute to disease remains poorly
understood. We previously identified the ENL protein as a reader of histone
acetylation via its YEATS domain, linking it to the expression of cancer-driving genes
in acute leukaemia. Recurrent hotspot mutations have been found in the ENL YEATS
domain in Wilms tumour, the most common type of paediatric kidney cancer. Here
we show, using human and mouse cells, that these mutations impair cell-fate
regulation by conferring gain-of-function in chromatin recruitment and
transcriptional control. ENL mutants induce gene-expression changes that favour a
premalignant cell fate, and, in an assay for nephrogenesis using murine cells, result in
undifferentiated structures resembling those observed in human Wilms tumour.
Mechanistically, although bound to largely similar genomic loci as the wild-type
protein, ENL mutants exhibit increased occupancy at a subset of targets, leading to a
marked increase in the recruitment and activity of transcription elongation
machinery that enforces active transcription from target loci. Furthermore,
ectopically expressed ENL mutants exhibit greater self-association and form discrete
and dynamic nuclear puncta that are characteristic of biomolecular hubs consisting
of local high concentrations of regulatory factors. Such mutation-driven ENL
self-association is functionally linked to enhanced chromatin occupancy and gene
activation. Collectively, our findings show that hotspot mutations in a
chromatin-reader domain drive self-reinforced recruitment, derailing normal cell-fate control
during development and leading to an oncogenic outcome.


The eleven-nineteen-leukaemia protein (ENL) is a chromatin reader
that maintains the oncogenic state in leukaemia. ENL interacts with
acetylated histone proteins via its well conserved YEATS (Yaf9, ENL,
AF9, Taf14, Sas5) domain, and, in so doing, helps to recruit and stabilize
its associated transcriptional machinery to drive the transcription of
target genes. Recently, somatic mutations in the ENL gene (also known
as MLLT1) were found in about 5% of people with Wilms tumour, making
ENL one of the most frequently mutated genes in this cancer type.
These mutations are recurrent, heterozygous and highly clustered in
the ENL YEATS domain. Interestingly, these ‘hotspot’ mutations all
involve small in-frame insertions or deletions (Fig. 1a and Extended Data
Fig. 1a). Whether and how such ENL mutations promote the formation
of Wilms tumour was unclear and is the focus of our study.


Impaired cell fate with ENL mutants 


To investigate the functional relevance of these ENL mutations, we 
created isogenic HEK293 (human embryonic kidney 293) and HK-2 
Fig. 1 | ENL mutations drive aberrant developmental programs and impair
nephron differentiation. a, Bottom, the domain structure of the ENL protein.
Top, the mutations found in the tumour mutants (T1 to T3) compared with the
wild-type (WT) protein sequence (in single-letter amino-acid code). The
mutated regions are in red. IDR, intrinsically disordered region; AHD,
ANC1 homologue domain. b, c, Heat map representation of genes that are
differentially expressed in HEK293 (b) and HK-2 (c) cells expressing WT or
mutant ENL (with a fold change of 1.5 or more, and false discovery rate (FDR) of
0.01 or less). Red and blue indicate relative high and low expression,
respectively (Supplementary Tables 1, 2). d, Gene ontology (GO) analyses of
upregulated genes (‘UP’) that are common to T1, T2 and T3 mutants in HEK293
cells (n = 219 genes; Supplementary Table 3). P-values (-log10 P) were obtained
by two-tailed Fisher exact test, adjusted by Bonferroni correction. e, Gene-set-enrichment
analysis (GSEA) plots evaluating the changes in the indicated gene


(human kidney-2) cells that stably expressed wild-type ENL or one 
of three distinct mutants (hereafter referred to as tumour mutants, or
T mutants) at equal levels (Fig. 1a and Extended Data Fig. 1b, c). The 
selected mutations include those that are most frequently observed 
in patients (T1), and represent both insertion (T1) and deletion (T2 and
T3) mutational patterns (Fig. 1a). By comparison with the transcription 
induced by wild-type ENL, the transcriptional changes induced by 
all three mutants were remarkably similar and were highly enriched 
for pathways involved in developmental processes (Fig. 1b-d and
Extended Data Fig. 1d, e). Notably, upon introduction of ENL mutants,
there was a marked upregulation of genes that are enriched in embryonic
kidney progenitors and Wilms tumour (Fig. 1e and Extended Data
Fig. 1f, g). These genes include developmentally critical genes such as
HOXA genes (Extended Data Fig. 1h, i). We observed a robust increase in
the expression of HOXA genes when a mutant transgene was expressed 
at levels close to those of the endogenous ENL protein (Extended Data 
Fig. 1j, k). We next expanded the analysis to other ENL YEATS mutations
(Extended Data Fig. 1a) that have been found so far in Wilms tumour and 
leukaemia (T4). All eight ENL mutations tested (T1-T8) were capable of 
inducing the expression of key target genes (Extended Data Fig. 1l, m),
suggesting that they probably act through convergent mechanisms. 
Taken together, these results indicate that ENL YEATS mutations confer 
gain-of-function in transcription control and cause gene-expression 
changes that are involved in kidney differentiation and Wilms tumour. 
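
The differential-expression cutoff given in the Fig. 1b, c legend (fold change of 1.5 or more, FDR of 0.01 or less) can be sketched as a simple filter. This is an illustration only: the `GeneResult` type, the placeholder gene names and the values are hypothetical, and treating the fold-change cutoff symmetrically (up or down by at least 1.5-fold) is an assumption, as the legend does not spell this out.

```python
from typing import NamedTuple


class GeneResult(NamedTuple):
    gene: str
    fold_change: float  # mutant / wild type on a linear scale (assumed convention)
    fdr: float          # false discovery rate for the comparison


def is_differential(r: GeneResult, min_fc: float = 1.5, max_fdr: float = 0.01) -> bool:
    """Apply the fold-change and FDR thresholds stated in the figure legend.

    Assumption: a gene counts as changed if it is up by >= min_fc
    or down by >= min_fc (that is, fold_change <= 1 / min_fc).
    """
    changed = r.fold_change >= min_fc or r.fold_change <= 1.0 / min_fc
    return changed and r.fdr <= max_fdr


# Placeholder values for illustration; not data from the study.
results = [
    GeneResult("GENE_A", 2.0, 0.001),  # up and significant: kept
    GeneResult("GENE_B", 1.2, 0.001),  # change too small: dropped
    GeneResult("GENE_C", 0.5, 0.005),  # down and significant: kept
    GeneResult("GENE_D", 3.0, 0.2),    # FDR too high: dropped
]
kept = [r.gene for r in results if is_differential(r)]
print(kept)
```

Both thresholds are exposed as parameters so that the same filter can be rerun with a different cutoff.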

Wilms tumour is characterized by persistent embryonic kidney tissues
and arrested cellular differentiation. This, coupled with the transcriptional
changes induced by ENL mutations, prompted us to examine
signatures (n = 366, 80, 95 genes from top to bottom; Supplementary Table 10)
induced by the T1 mutant in HEK293 cells. NES, normalized enrichment score.
f, Representative haematoxylin and eosin (H&E) staining of mESC-derived
kidney structures. Green and red arrowheads point to nephric tubule and
glomerulus, respectively. The yellow dashed line outlines a region of blastema.
Control group, empty vector or WT ENL; mutant group, T1, T2 or T3.
g, Quantification of the surface area of blastema components. Mean ± s.e.m.,
one-sided Mann-Whitney ranked test; from left to right, n = 3, 3, 4, 4, 4
independent experiments. h, Representative immunofluorescence staining of
induced kidney structures, labelling the nephric-tubule marker E-cadherin
(green arrow) and the glomerular marker WT1 (pink arrow). The yellow dashed
line outlines a region of blastema. DAPI, 4',6-diamidino-2-phenylindole, a
nuclear marker. Scale bars in f, h represent 50 µm. Data in f, h represent four
independent experiments.


the impact of ENL mutations on embryonic kidney differentiation. To
this end, we adopted a well established three-dimensional nephrogenesis
assay. In this assay, nephron progenitors are first derived from mouse
embryonic stem cells (mESCs), and then induced to undergo robust tubulogenesis
upon co-culture with embryonic spinal cord (Extended Data
Fig. 2a). We observed signature gene-expression changes at each step
of the differentiation process (Extended Data Fig. 2b). These included
Hoxa genes (for example, Hoxa11) peaking at the induced metanephric
mesenchyme, which contains nephron progenitors. We also identified
differentiated nephron structures, including proximal tubules, distal
tubules and glomeruli (Extended Data Fig. 2c-e). In the presence of ENL
mutants (Extended Data Fig. 2f), there was a marked increase in the presentation
of structures that pathologically resemble undifferentiated
blastema components in human Wilms tumour (Fig. 1f-h). Unlike the
well differentiated epithelium, these undifferentiated components were
highly proliferative, and expressed the mesenchymal marker protein
vimentin (Extended Data Fig. 3a-i). They also retained the expression
of the Wilms tumour-1 protein (encoded by the WT1 gene) (Fig. 1h), a
transcription factor that is normally expressed in nephron progenitors
and glomerular podocytes. Collectively, these results show that ENL
mutations impair kidney cell differentiation and give rise to tumour-like
structures, suggesting a role in the development of Wilms tumour.


Enhanced chromatin occupancy by mutant ENL 


Next, we investigated the mechanisms by which ENL mutations drive 
aberrant gene expression. Given that these mutations are clustered in 
Fig. 2 | ENL mutations enhance chromatin occupancy by ENL and associated
SEC complex to enforce active transcription. a, Heat map representation of
ENL-bound chromatin peaks that show increased occupancy by T1, T2 and T3
mutants compared with WT ENL (n = 54; Supplementary Table 6) in HEK293
cells. These heat maps are centred on ENL-bound peaks across a ±5-kb window.
The colour key represents the signal density, with darker colour representing a
more intense signal. b, GSEA plots showing that genes (n = 87; Supplementary
Table 10) with enhanced occupancy by ENL mutants are upregulated in mutant-expressing
HEK293 cells. c, Genome browser view of Flag-ENL, CDK9 and Pol II
phosphorylated serine 2 (S2P) ChIP-seq signals at selected ENL target genes in
HEK293 cells. d, e, Box plots showing log2 fold changes (T mutants versus WT)
in CDK9 (d) or Pol II S2P (e) ChIP-seq signals at genes that have enhanced


the YEATS domain, which is important for ENL to localize to chromatin,
we first investigated whether the genomic distribution of ENL is
altered by the mutations. We performed chromatin immunoprecipita- 
tion followed by high-throughput DNA sequencing (ChIP-seq) experi- 
ments to map Flag-tagged wild-type or mutant ENL in HEK293 and HK-2 
cells. We found that all three mutant ENL proteins occupied largely 
similar genomic loci to wild-type ENL (Extended Data Fig. 4a-f), indi- 
cating that the mutations largely do not redistribute ENL to new target 
sites. Instead, each mutant exhibited enhanced occupancy at a subset 
of ENL target genes, and there was substantial overlap between the 
subsets occupied by each mutant, including the HOXA cluster (Fig. 2a 
and Extended Data Fig. 5a—e). Notably, increased occupancy of ENL 
mutants at these target genes correlated strongly with gene activation 
(Fig. 2b and Extended Data Fig. 5f). 

We next investigated how increased occupancy of ENL mutants leads 
to aberrant gene activation. ENL resides in large protein complexes 
that are involved in transcription activation, notably the super elongation
complex (SEC), elongation-assisting proteins (EAPs), and AFF1-ENL-P-TEFb
complex (AEP), which contain overlapping subunits
(for simplicity, we refer only to ‘SEC’ hereafter), as well as the DOT1L
complex. The interaction of ENL with these complexes is mediated
by ENL’s ANC1 homologue domain (AHD) (Fig. 1a), and such interactions
are not greatly affected by tumour mutations (Extended Data
Fig. 6a). We then investigated whether the chromatin occupancy of 
these complexes is affected by ENL mutations. We first carried out 
ChIP-seq analyses to compare the binding of cyclin-dependent kinase 
9 (CDK9), a component of the SEC complex that phosphorylates RNA




binding of ENL mutants (mutants_up, n = 87; Supplementary Table 10) and at
the other genes in the genome (others) in HEK293 cells. The indicated P-values
were obtained by one-sided Wilcoxon signed-rank tests. For box plots, the
centre lines represent the median; the box limits are the 25th and 75th
percentiles; and the whiskers show the minimum to maximum values.
f, Analysis of messenger RNA expression (normalized to GAPDH expression)
from the indicated ENL-target genes in HEK293 cells that express the indicated
ENL or vector constructs upon treatment with flavopiridol for 3 h. Increasing
dosages (0, 10 nM, 100 nM and 1,000 nM) are depicted by the grey wedge.
Means ± s.e.m., n = 3 technical replicates. Data represent two independent
experiments.


polymerase II (Pol II) at the serine 2 site on its carboxy-terminal tail.
We observed a marked increase in CDK9 occupancy, preferentially at 
genomic loci that exhibit enhanced binding of ENL mutants (Fig. 2c, 
d and Extended Data Fig. 6b, d). We also detected increased levels of 
elongation-specific phosphorylation of Pol II serine 2 at these same 
sites (Fig. 2c, e and Extended Data Fig. 6c, e). In agreement with this 
mechanism of gene activation, we found that ENL-mutant-induced 
gene activation was abolished by treatment with flavopiridol (Fig. 2f 
and Extended Data Fig. 6g), which inhibits the kinase activity of CDK9.
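
The box plots in Fig. 2d, e compare log2 fold changes (T mutant versus WT) in ChIP-seq signal between genes with enhanced mutant binding and all other genes. A minimal sketch of that calculation follows, assuming per-gene signal values are already available; the pseudocount, the function name and the example numbers are hypothetical, and the authors' actual normalization is not described in this section.

```python
import math
from statistics import median


def log2_fold_change(mutant_signal: float, wt_signal: float,
                     pseudocount: float = 1.0) -> float:
    """log2 ratio of mutant to wild-type signal.

    The pseudocount (an assumption of this sketch) avoids division by zero
    and damps ratios between very low signals.
    """
    return math.log2((mutant_signal + pseudocount) / (wt_signal + pseudocount))


# Hypothetical (mutant, WT) signal pairs per gene; not data from the study.
mutants_up = [(30.0, 10.0), (50.0, 12.0), (22.0, 9.0)]
others = [(10.0, 10.0), (8.0, 9.0), (15.0, 14.0)]

lfc_up = [log2_fold_change(m, w) for m, w in mutants_up]
lfc_others = [log2_fold_change(m, w) for m, w in others]

# The median of each group corresponds to the centre line of its box plot.
print(median(lfc_up), median(lfc_others))
```

A group-level test such as the one-sided Wilcoxon signed-rank test named in the legend would then be applied to these per-gene values; it is omitted here to keep the sketch dependency-free.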

By contrast, we did not observe a substantial change in DOT1L-mediated
dimethylation of lysine 79 on histone H3 proteins at the same ENL-target
genes (Extended Data Fig. 6f). A recent study proposed that increased
interaction of ENL with PAF1 underlies the effects of the ENL mutations
found in Wilms tumour. However, contrary to this proposed model, we did
not observe changes in PAF1 binding as a result of ENL mutations (Extended
Data Fig. 7a). Moreover, depletion of PAF1 (Extended Data Fig. 7b, c) had
minimal effects on the chromatin occupancy of ENL mutants and the activation
of target genes (Extended Data Fig. 7d-f). Together, these results
suggest that ENL YEATS mutations drive aberrant gene expression mainly by
increasing chromatin occupancy by ENL and associated SEC proteins. This
observation prompted us to further investigate the mechanisms underlying
the enhanced ENL chromatin occupancy driven by the mutations.


Acylation binding required by ENL mutants 


Using the structure of the ENL YEATS domain, we mapped the tumour
mutations to a region that is distant from the acetyl-lysine-binding 
Fig. 3 | Acylation-reading activity is required for enhanced chromatin
occupancy by ENL T mutants. a, Structure (Protein Data Bank code 5J9S) of the
ENL YEATS domain (blue ribbon) bound to a histone H3 peptide comprising an
acetylated lysine 27 residue (H3K27ac, yellow), showing a key ENL residue
(Y78, green) that mediates recognition of histone acetylation and the region
that is mutated in cancer (red). b, Genome browser view of the ChIP-seq signals
from different Flag-ENL proteins at the genes indicated at the bottom in
HEK293 cells. c, Box plots showing the fold change (FC) in Flag-ENL ChIP-seq
signals (relative to wild-type ENL) at peak regions that bear enhanced
occupancy of ENL T mutants (n = 54) in HEK293 cells. Centre lines represent
medians, the box limits are the 25th and 75th percentiles and the whiskers
show the range of values. P-values were obtained using paired two-tailed
t-tests. d, mRNA expression analysis (normalized to GAPDH) of selected genes
in HEK293 cells expressing the indicated constructs. Data represent means
from n = 2 technical replicates, and results are representative of three
independent experiments.


pocket (Fig. 3a). As such, we wondered whether ENL T mutants could 
bypass the acetyl-lysine-binding activity that is ordinarily required for 
chromatin targeting. To this end, we introduced a point mutation (Y78A,
referred to as a ‘pocket mutation’ hereafter) that is known to abolish the
acetyl-lysine-binding activity of the YEATS domain into either wild-type
or T-mutant ENL (Extended Data Fig. 8a). As expected, this pocket mutation
severely reduced the chromatin occupancy of the otherwise wild-type
ENL. Moreover, the chromatin occupancy of T mutants was also
diminished upon introduction of the pocket mutation (Fig. 3b, c and 
Extended Data Fig. 8b-f). Consequently, tumour-mutation-induced 
activation of target genes was blunted (Fig. 3d). These results indicate 
that, like wild-type ENL, T-mutant ENL still requires its reader function 
for proper genomic localization. 

We then considered the possibility that ENL tumour mutations might 
drive enhanced chromatin occupancy by increasing the acetyl-lysine- 
binding affinity albeit at a distance from the defined binding pocket. 
To test this hypothesis, we performed quantitative isothermal titra- 
tion calorimetry (ITC) assays using the purified wild-type or mutant 
ENL YEATS domain and a histone H3 peptide comprising acetylated 
lysine 27. We found that although each T mutant showed variable 
degrees of interaction with the acetylated histone peptide, none of 
them exhibited an increase in acetyl-lysine binding compared with the 
wild type (Extended Data Fig. 8g, h). In addition, these tumour muta- 
tions did not increase the binding to other, longer acylations, such as 
crotonylation (Extended Data Fig. 8i)—another histone modification 
that the YEATS domain recognizes. Together, these results strongly
suggest that, although mutant ENL still depends on its reader function
for chromatin targeting, tumour mutations enhance ENL accumulation 
at target sites through a mechanism that is distinct from the reading 
of histone acylation. 
Increased self-association of ENL mutants


Given the similar genomic localization of wild-type and mutant ENL pro- 
teins (Extended Data Fig. 4a-f), we next speculated that tumour muta- 
tions might drive the self-mediated recruitment of ENL to chromatin. 
To test this possibility, we co-expressed enhanced yellow fluorescent 
protein (eYFP)-labelled ENL fused with LacI and mCherry-ENL without 
LacI (Fig. 4a) in cells that contain a synthetic Lac operator (LacO) array 
in the genome, and examined the recruitment of ENL to the LacO 
locus. As expected, the LacO array recruited a large number of 
eYFP-ENL-LacI molecules via targeted DNA binding, forming a concentrated 
local interaction hub on the chromatin (Fig. 4b). We predicted that 
mCherry-ENL becomes enriched at the array only when it can associate 
with the co-expressed eYFP-ENL-LacI through ENL self-association. We 
observed a modest self-association of wild-type ENL at the array, while 
all three T mutants showed much stronger self-mediated recruitment 
(Fig. 4b, c). These results suggest that tumour mutations promote self- 
reinforced recruitment of ENL, and that this can occur independently 
of the initial recruitment mechanism (for example, histone acylation 
binding) and the underlying genomic target sequences. 
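
The readout of the LacO assay can be sketched as a simple ratio. The following is a minimal illustration, not the authors' analysis pipeline: it assumes that enrichment is defined as the mean mCherry intensity inside the array region divided by the mean intensity in the rest of the nucleus, so that values above 1 indicate recruitment. The function name and the pixel values are hypothetical.

```python
from statistics import mean


def array_enrichment(array_pixels, nucleoplasm_pixels):
    """Ratio of mean mCherry intensity at the LacO array to the mean
    intensity in the rest of the nucleus (assumed definition)."""
    if not array_pixels or not nucleoplasm_pixels:
        raise ValueError("both regions need at least one pixel")
    background = mean(nucleoplasm_pixels)
    if background <= 0:
        raise ValueError("nucleoplasm intensity must be positive")
    return mean(array_pixels) / background


# Hypothetical pixel intensities (arbitrary units), for illustration only.
diffuse = array_enrichment([100, 102, 98], [100, 100, 100])
recruited = array_enrichment([300, 320, 280], [100, 100, 100])
```

Under this assumed definition, a protein that does not associate with the array-bound partner gives a ratio near 1, whereas self-association pushes the ratio above 1.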

We also noticed that mutant, but not wild-type, ENL formed many 
smaller puncta outside of the LacO array (Fig. 4b), further supporting 
the idea of stronger self-association driven by the mutations. Consist- 
ently, we observed the formation of discrete puncta throughout the 
nucleus by T mutants over a wide range of expression levels (Fig. 4d, 
e and Extended Data Fig. 9a) when we expressed mutant mCherry- 
ENL alone. By contrast, wild-type mCherry-ENL was largely diffused 
throughout the nucleus when expressed at the same levels as the 
mutants. Of note, the puncta formed by T1 mutants were noticeably 
larger than those formed by T2 and T3 mutants (Extended Data Fig. 9b), 
correlating with the highest self-mediated recruitment of T1 to the 
LacO array (Fig. 4b, c). Notably, T1 exhibits a mutational pattern (an 
insertion) that is distinct from that of T2 and T3 mutants (deletion). 
Introduction of the Y78A mutation into all three T mutants had mini- 
mal impacts on puncta formation (Extended Data Fig. 9c, d). These 
results further support the conclusion that tumour mutations promote 
ENL self-association through a mechanism that is decoupled from the 
acylation-reading function of ENL. 
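
The puncta measurements plotted in Fig. 4e relate two per-cell quantities: the fraction of nuclear fluorescence found inside puncta and the mean nuclear intensity. A minimal sketch of how such quantities could be computed is given below. It assumes that puncta have already been segmented into a per-pixel mask; the segmentation step, the function names and the example values are hypothetical and are not taken from the paper.

```python
def in_puncta_fraction(intensities, in_punctum):
    """Fraction of total nuclear fluorescence located inside puncta.

    intensities: per-pixel nuclear intensities (arbitrary units)
    in_punctum:  booleans of the same length, True where a pixel was
                 assigned to a punctum (segmentation assumed done upstream)
    """
    if len(intensities) != len(in_punctum):
        raise ValueError("mask and image must have the same length")
    total = sum(intensities)
    if total <= 0:
        raise ValueError("total nuclear intensity must be positive")
    inside = sum(value for value, flag in zip(intensities, in_punctum) if flag)
    return inside / total


def mean_nuclear_intensity(intensities):
    """Mean intensity over all nuclear pixels (the x axis in such plots)."""
    return sum(intensities) / len(intensities)


# Hypothetical four-pixel nucleus: two dim pixels, two bright punctum pixels.
pixels = [10, 10, 90, 90]
mask = [False, False, True, True]
fraction = in_puncta_fraction(pixels, mask)
expression = mean_nuclear_intensity(pixels)
```

Plotting the fraction against the mean intensity for many cells is what allows puncta formation to be compared at matched expression levels.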

Further characterization of the puncta formed by ENL mutants 
revealed that they are spherical in shape (Fig. 4d), undergo fusion on 
contact (Supplementary Video SI) and are highly dynamic (Extended 
Data Fig. 9e, f). These features are characteristic of phase-separation- 
driven biomolecular condensates in other biological contexts, an 
extreme form of local high-concentration hubs mediated by weak and 
dynamic multivalent molecular interactions. These results suggest 
that the self-association of mutant ENL involves multivalent inter- 
actions, which could be achieved by proteins composed of modular 
interaction domains or intrinsically disordered regions. ENL contains 
a well-structured YEATS domain, a predicted intrinsically disordered 
region (IDR), and an AHD region that mediates ENL’s interaction with 
binding partners such as SEC (Fig. 4f). To determine which regions of 
ENL are required for mutation-driven self-association and function, 
we generated a series of truncated ENL proteins harbouring the T1 
mutation (Fig. 4f). A YEATS domain alone with the T1 mutation was 
not sufficient to drive the formation of nuclear puncta (Fig. 4g and 
Extended Data Fig. 9g), suggesting that regions outside of the YEATS 
domain are also involved. Supporting this, deletion of the IDR, and to a 
lesser extent of the AHD, compromised T1-driven ENL self-association 
and reduced puncta formation (Fig. 4g and Extended Data Fig. 9g). 
Enhanced chromatin occupancy driven by the T1 mutation was also 
attenuated to varying degrees by deletion of the IDR or AHD (Fig. 4h 
and Extended Data Fig. 9h). Lastly, we observed a substantial decrease 
in T1-induced gene activation upon deletion of the IDR or AHD (Fig. 4i). 
Of note, despite retaining partial self-association and chromatin target- 
ing, deletion of the AHD in T1 mutant ENL resulted in a profound defect 
Fig. 4 | Tumour mutations enhance ENL self-association to drive reinforced 
recruitment and gene activation. a, Testing of ENL self-mediated recruitment 
to the LacO array. mCherry-ENL can be recruited to the array only through 
self-association with eYFP-ENL-LacI proteins that have already been recruited. 
b, Fluorescence images of LacO-containing U2OS cells that have co-expressed 
various combinations of mCherry-ENL and eYFP-ENL-LacI. White dashed 
circles indicate the LacO array. Scale bar, 5 μm. c, Quantification of 
mCherry-ENL enrichment at the LacO array bound by various eYFP-ENL-LacI proteins. 
Enrichment of mCherry above an expression level of 1 suggests ENL-ENL 
self-association. Shown are means ± s.e.m.; n = 24, 9, 13, 14, 51, 62, 39, 38 cells from 
left to right; one-tailed unpaired t-test. d, Fluorescence images of HEK293 cells 
that express similar levels of WT or mutant mCherry-ENL. Scale bar, 5 μm. 

e, g, Fraction of in-puncta fluorescence intensity in the nucleus of HEK293 cells 
that express the indicated mCherry-ENL constructs as a function of mean 
nuclear intensity. Each dot represents one cell. f, Schematics of full-length (FL) 
or different deletion forms of ENL. AU, arbitrary units. h, ChIP with quantitative 
PCR (ChIP-qPCR) analysis of the indicated Flag-ENL constructs at HOXA genes 
in HEK293 cells. TSS, transcription start site. i, mRNA expression analysis 
(normalized to GAPDH) of HOXA genes in HEK293 cells expressing equal levels 
of the indicated ENL constructs. The data in panels h, i represent means ± s.e.m. 
from n = 3 technical replicates; independent experiments were repeated three 
times with similar results. j, During normal kidney development, wild-type ENL 
(ENLwt), a component of the SEC, binds to acetylated histone proteins in 
chromatin. The CDK9 component of the SEC phosphorylates RNA polymerase 
II (yellow circle on pol II), resulting in transcription appropriate to normal 
development. By contrast, mutant ENL (ENLmut) shows increased self- 
association and increased phosphorylation of pol II, resulting in aberrant gene 
activation that contributes to the development of Wilms tumour. Potential 
strategies to inhibit the oncogenic effects of ENL mutations are indicated by 
numbers 1-3. 
in gene activation, further strengthening the conclusion that AHD- 
mediated interaction with SEC proteins is critical for ENL mutation- 
driven transcriptional control. These results suggest that, in addition 
to the YEATS domain, the IDR and, to a lesser extent, the AHD 
are also involved in mutation-driven ENL self-association, and provide 
further evidence that functionally links the enhanced self-association 
propensity to chromatin occupancy and gene activation. 


Discussion 

In this study, we have shown that cancer-associated hotspot muta- 
tions in a chromatin reader drive enhanced self-association (Fig. 4j). 
This gain-of-function property, coupled with its acylation-reading 
activity, is functionally required for mutant ENL to be recruited to 
chromatin and to control gene expression, thus providing a previ- 
ously unrecognized mechanism for driving developmentally critical 
genes into an extended active state to restrict cellular differentiation 
(Fig. 4j). These findings shed new light on how the dysregulation of 
chromatin-mediated mechanisms derails normal cell-fate control 
towards an oncogenic path, and unveil potential mechanism-guided 
strategies for inhibiting the oncogenic function of ENL mutations. 
These strategies include disrupting the interaction between the ENL 
YEATS domain and acylated histones, blocking the self-association of 
mutant ENL and inhibiting the activity of ENL-associated SEC (Fig. 4j). 
Notably, the enhanced self-association conferred by tumour mutations 
enables overexpressed ENL protein to form local hubs that involve weak 
and dynamic multivalent interactions and harbour characteristics of 
phase separation. Future studies are needed to probe the dynamics and 
regulation of mutant-ENL-driven interaction hubs at target chromatin, 
and to evaluate the extent to which these interaction hubs resemble 
recently described transcriptional clusters. It remains to be seen 
whether other chromatin-associated proteins are hijacked in cancer 
in a similar fashion, but these gain-of-function mutations involving 
the acylation reader ENL in Wilms tumour and leukaemia expand our 
knowledge of the types of diseases that are caused by ‘misinterpreting’ 
histone modifications. These pathologies, together with the rapidly 
growing list of those that arise from ‘mis-writing’ or ‘mis-erasing’ histone 
marks, highlight important roles of histone modifications in 
human health and disease that warrant further investigation. 
Reporting summary 
Extended Data Fig. 1 | ENL mutations induce transcriptional changes that 
are implicated in developmental programs and in Wilms tumour. a, Bottom, 
ENL protein structure, with the region that is mutated in cancer shown in red. 
Above, amino-acid sequences of the T1 to T8 tumour-associated mutations and 
the corresponding WT region. b, c, Western blots showing the levels of 
ectopically expressed WT or mutant Flag-ENL proteins in HEK293 (b) and HK-2 
(c) cells. Independent experiments were repeated four times with similar 
results. β-Tubulin is used as a loading control. d, e, Venn diagrams showing the 
number and overlap of genes for which expression is significantly upregulated 
upon expression of mutant ENL as compared with WT ENL in HEK293 (d) and 
HK-2 (e) cells. Genes with a fold change of 1.5 or more and a false discovery rate 
(FDR) of 0.01 or less are considered to be significantly upregulated. f, g, GSEA 
plots evaluating the changes in the indicated gene signatures upon expression 
of the indicated ENL mutants compared with WT in HEK293 (f) and HK-2 (g) 
cells. h, i, Volcano plots of RNA-sequencing data demonstrating the −log10 
P-values versus log2 fold changes in HEK293 (h) and HK-2 (i) cells. HOXA genes 
are highlighted in red. P-values were determined by two-tailed exact test, 
adjusted by FDR. j, Western blot showing the close-to-endogenous levels of 
ectopically expressed WT or mutant Flag-ENL in HEK293 cells. Independent 
experiments were repeated three times with similar results. k, mRNA 
expression analysis (normalized to GAPDH) of selected ENL target genes in 
HEK293 cells (from panel j) expressing the indicated constructs. vec, vector 
control. Data represent mean ± s.e.m., n = 3 technical replicates; independent 
experiments were repeated three times with similar results. l, Western blot 
showing the protein levels of ectopically expressed wild-type or indicated 
mutant (as illustrated in a) Flag-ENL in HEK293 cells. Experiment repeated 
three times independently with similar results. m, mRNA expression analysis 
(normalized to GAPDH) of selected ENL target genes in HEK293 cells (from 
panel l) expressing the indicated constructs. Vec, vector control. Data 
represent means from n = 2 technical replicates; results are representative of 
three independent experiments. For gel source data (b, c, j, l), see 
Supplementary Fig. 1. 
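
The significance criterion stated in this legend (fold change of 1.5 or more, FDR of 0.01 or less) and the overlap shown in the Venn diagrams can be sketched as follows. This is an illustrative sketch only: the gene names and values are hypothetical, and it assumes that the differential-expression output has already been reduced to a log2 fold change and an FDR per gene.

```python
import math

FOLD_CHANGE_MIN = 1.5   # threshold stated in the legend
FDR_MAX = 0.01          # threshold stated in the legend


def is_upregulated(log2_fold_change, fdr):
    """True if a gene passes both thresholds (mutant versus WT)."""
    return log2_fold_change >= math.log2(FOLD_CHANGE_MIN) and fdr <= FDR_MAX


def upregulated_genes(results):
    """results maps gene name -> (log2 fold change, FDR)."""
    return {gene for gene, (lfc, fdr) in results.items()
            if is_upregulated(lfc, fdr)}


# Hypothetical results for two mutants, for illustration only.
t1 = {"GENE_A": (2.0, 0.001), "GENE_B": (0.3, 0.001), "GENE_C": (1.2, 0.2)}
t2 = {"GENE_A": (1.1, 0.005), "GENE_C": (1.5, 0.004)}

# The intersection corresponds to the shared region of a Venn diagram.
shared = upregulated_genes(t1) & upregulated_genes(t2)
```

Working on the log2 scale keeps the threshold consistent with the x axis of the volcano plots in panels h and i.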
Extended Data Fig. 2 | Three-dimensional nephron structures derived from 
mESCs. a, Diagram showing the in vitro directed differentiation assay. 
Signature genes expressed at each step are shown at the bottom. T, the 
Brachyury gene. b, mRNA expression analysis (normalized to Gapdh) of the 
indicated genes at different time points during the assay. Data shown are 
representative of two independent experiments. c, Haematoxylin and eosin 
staining shows the induced embryoid body co-cultured with the spinal cord 
(sp). Green and red arrowheads point to nephric tubule and glomerulus, 
respectively. Scale bars: left, 500 μm; middle, 100 μm; right, 50 μm. 
d, Representative immunofluorescence staining of induced kidney structures 
for the nephric distal-tubule marker E-cadherin (green) and the glomerular 
marker WT1 (red). DNA is stained with DAPI (blue). Scale bar, 25 μm. e, 
Representative immunofluorescence staining of induced kidney structures for 
E-cadherin (green) and the proximal-tubule marker Lotus tetragonolobus lectin 
(LTL, red). DNA was stained with DAPI (blue). Scale bar, 25 μm. f, Western blot 
showing the protein levels of ectopically expressed WT or mutant Flag-tagged 
ENL in mESCs. For gel source data, see Supplementary Fig. 1. For panels c-f, 
independent experiments were repeated three times with similar results. 
Extended Data Fig. 3 | Characterization of ENL-mutant kidney structures. 
a, d, g, Representative haematoxylin and eosin staining of the indicated kidney 
structures. b, e, h, Representative immunohistochemistry staining of the 
indicated kidney structures for the proliferation marker Ki-67. 
c, f, i, Representative immunohistochemistry staining of the indicated kidney 
structures for the mesenchymal marker vimentin. In panels c, f, the 
vimentin-positive cells shown are stroma cells. In panel i, the vimentin-positive cells 
shown are mostly blastema components. a-c, WT epithelium; d-f, mutant 
epithelium; g-i, mutant blastema. All experiments were repeated twice with 
similar results. Scale bars, 50 μm. 
Extended Data Fig. 4 | ENL mutants occupy largely the same genomic loci as 
the wild-type protein. a, b, Bar graphs showing the genomic distribution of 
Flag-ENL-bound peaks in HEK293 (a) and HK-2 (b) cells. c, d, Heat maps of 
normalized WT or mutant Flag-ENL ChIP-seq signals in HEK293 (c) and HK-2 
(d) cells, centred on ENL-bound peaks across a ±5-kb window. The colour key 
represents the signal density, with darker colour representing more signal. 
More details are in Supplementary Tables 4, 5. e, f, Venn diagrams showing the 
overlap of WT or mutant ENL-bound peaks in HEK293 (e) and HK-2 (f) cells. 
Extended Data Fig. 5 | Enhanced occupancy of ENL mutants at a shared 
subset of target genes correlates with gene activation. a, Venn diagram 
showing the number and the overlap of peaks with enhanced binding of 
individual mutant ENLs as compared with WT ENL in HEK293 cells. b, Heat 
maps of normalized WT or mutant Flag-ENL ChIP-seq signals in HK-2 cells, 
centred on mutant-enhanced peaks (fold change greater than 1.5) across 
a ±5-kb window. More details are in Supplementary Table 7. c, Genome browser 
view of Flag-ENL ChIP-seq signals at selected target genes in HK-2 cells 
expressing indicated Flag-ENL transgenes. d, e, ChIP-qPCR of Flag-ENL at 
selected ENL target genes in two batches of HEK293 cells that are expressing 
the indicated ENL transgenes at levels higher than those of the endogenous 
ENL protein (d; see Extended Data Fig. 1b) or close to endogenous levels (e; see 
Extended Data Fig. 1j). Data in d represent means from n = 2 technical 
replicates, and are representative of three independent experiments. Data in e 
represent means ± s.e.m. from n = 3 technical replicates; independent 
experiments were repeated twice with similar results. f, GSEA plots showing 
that genes (n = 91; Supplementary Table 10) with enhanced occupancy of ENL 
mutants are upregulated in mutant-expressing HK-2 cells. 
Extended Data Fig. 6 | Enhanced binding of ENL mutants at target genes 
leads to increased SEC recruitment and activity. a, Western blot analysis of 
co-immunoprecipitation (IP) using the M2 anti-Flag antibody with lysates from 
HEK293T cells that are expressing the indicated Flag-ENL constructs. The 
experiment was repeated twice with similar results. For source data, see 
Supplementary Fig. 1. b, c, GSEA plots of genes ranked by their fold-change 
(mutant over WT) of CDK9 (b) or Pol II S2P (c) ChIP-seq signals in HEK293 cells, 
annotated against the set of genes (n = 87) that show increased occupancy of 
ENL mutants compared with WT. d-f, ChIP-qPCR analysis of CDK9 (d), Pol II S2P 
(e) and H3K79me2 (f; dimethylation of lysine 79 of histone H3) at selected ENL 
target genes in HEK293 cells that are expressing the indicated Flag-ENL 
constructs. Data represent means ± s.e.m.; n = 3 technical replicates. 
Experiments were repeated twice with similar results. g, mRNA expression 
analysis (normalized to GAPDH) of selected ENL target genes in HK-2 cells 
expressing the indicated constructs upon treatment with flavopiridol for 3 h. 
Increasing dosages (0, 125 nM, 250 nM and 1,000 nM) are depicted by grey 
wedges. Data represent means from n = 2 technical replicates. Experiments 
were repeated twice with similar results. 




Extended Data Fig. 7 | Loss of PAF1 has minimal effect on the functionality of 
cancer-associated ENL mutants. a, Western blot analysis of 
co-immunoprecipitation using the M2 anti-Flag antibody in lysates from HEK293T 
cells expressing the indicated Flag-ENL constructs. Results are representative 
of three independent experiments. b, c, mRNA expression (b) and western blot 
(c) analysis showing the knockdown efficiency of short interfering (si)RNAs 
that target PAF1 in HEK293 cells. Independent experiments were repeated 
twice with similar results. d, e, ChIP-qPCR analysis of Flag-ENL at selected ENL 
target genes in control (siCtrl) or PAF1 knockdown (siPAF1) HEK293 cells. 
f, mRNA expression analysis (normalized to GAPDH) of selected ENL target 
genes in control (siCtrl) or PAF1 knockdown (siPAF1) HEK293 cells. Data in d-f 
represent means ± s.e.m.; n = 3 technical replicates. For gel source data in 
panels a, c, see Supplementary Fig. 1. 
Extended Data Fig. 8 | Interaction with histone acylation is essential but not 
sufficient for chromatin occupancy by cancer-associated ENL mutants. 
… ±5-kb window. e, f, ChIP-qPCR analysis of Flag-ENL at selected ENL target 
genes in HEK293 cells expressing the indicated constructs. Data represent 
means of n = 2 technical replicates. Independent experiments were repeated 
… crotonylated at lysine 27 (H3(19-30)K27Cro; i). 
Extended Data Fig. 9 | Characterization of nuclear puncta formed by 
ectopically expressed ENL tumour mutants. a, Fraction of in-puncta 
fluorescence intensity in the nucleus of HEK293 cells that express WT or 
mutant mCherry-ENL, as a function of mean out-of-puncta nuclear intensity. 
Each dot represents one cell (the same experiment as in Fig. 4e). b, Dot plots 
showing the radius of puncta in HEK293 cells that are expressing similar levels 
of the indicated mCherry-ENL proteins. n = 20 independent puncta, randomly 
selected from four different cells per group. P-values were obtained using 
two-tailed unpaired Student’s t-test. Centre lines represent medians; whiskers 
indicate the minimum to maximum range. c, d, Fraction of in-puncta mCherry-ENL 
intensity in the nucleus as a function of mean nuclear intensity (c) or mean 
out-of-puncta nuclear intensity (d) in HEK293 cells that express the indicated 
mCherry-ENL proteins. Each dot represents one cell. e, Representative images 
from fluorescence recovery after photobleaching (FRAP) analysis in HEK293 
cells expressing T3 mutant mCherry-ENL. The white dashed circles indicate 
the punctum undergoing targeted bleaching. Images represent 14 FRAP 
experiments in total with T1/2/3 mCherry-ENL. f, Averaged FRAP curves from 
areas inside the mCherry-ENL puncta formed by the indicated ENL mutants. 
Bleaching occurs at t = 0 s. Data represent means ± s.e.m.; n = 6 (T1), 5 (T2) and 3 
(T3) distinct puncta from multiple cells. g, Fraction of in-puncta fluorescence 
intensity in the nucleus of HEK293 cells that express the indicated mCherry-ENL 
constructs as a function of mean nuclear out-of-puncta intensity. Each dot 
represents one cell (same experiment as in Fig. 4g). h, Western blot showing the 
protein levels of ectopically expressed Flag-ENL in HEK293 cells. Experiments 
were repeated three times with similar results. For gel source data, see 
Supplementary Fig. 1. 
Life sciences study design 


All studies must disclose on these points even when the disclosure is negative. 


Sample size For most high-throughput sequencing experiments, two independent biological replicates were used (e.g. RNA-seq, and most of ChIP-seq). 
Additional independent experiments were repeated and ChIP-qPCR and RT-qPCR were used to validate key results obtained from high- 
throughput sequencing. 


Sample sizes for other assays were not predetermined and were chosen based on our prior experience and common standards in the field for 
detecting statistically significant differences between conditions. 


Data exclusions No data was excluded from the analysis. 
Replication Each experiment was repeated (see figure legends) and all findings were reproducible. In most assays, the conclusions about functional 
difference between wild-type and mutant ENL were drawn from comparing wild-type and multiple distinct mutants (T1, T2, T3), which further 
strengthens the conclusions. 


Randomization Samples were allocated to groups according to genotype or treatment. No randomization was required as the starting materials (e.g. parental 
cell lines before generation of sub-lines with different genotypes) are identical. 


Blinding The investigators were not blinded to allocation during experiments and outcome assessment except for pathological analysis in Fig. 1g. 


Reporting for specific materials, systems and methods 


We require information from authors about some types of materials, experimental systems and methods used in many studies. Here, indicate whether each material, 
system or method listed is relevant to your study. If you are not sure if a list item applies to your research, read the appropriate section before selecting a response. 


Materials & experimental systems Methods 
n/a | Involved in the study n/a | Involved in the study 
Antibodies ChIP-seq 
Eukaryotic cell lines Flow cytometry 
Palaeontology MRI-based neuroimaging 


Animals and other organisms 


Human research participants 


Clinical data 


Antibodies 


Antibodies used All antibodies used have been provided in Supplementary Tables with details (e.g. supplier name, catalog number, application 
and dilution) 
Validation For Immunoblot, the correct size of the detected bands was assessed based on the protein marker. 


1. Flag: Manufacturer indicates reactivity by immunoblot to detect Flag epitope-tagged proteins. We have demonstrated in prior 
work (Wan et al. 2017) that this antibody can be used for ChIP-seq and confirmed its specificity using cells without Flagged 
transgenes. 

2. ENL: Manufacturer indicates human reactivity by immunoblot. We have demonstrated its specificity in prior work (Wan et al. 
2017) using ENL KD cells. 

3. b-actin: Manufacturer indicates human and mouse reactivity by immunoblot. It is widely used in western blot as a loading 
control. 

4. GAPDH: Manufacturer indicates human and mouse reactivity by immunoblot. It is widely used in western blot as a loading 
control. 

5. b-tubulin: Manufacturer indicates human and mouse reactivity by immunoblot. It is widely used in western blot as a loading 
control. 


6. AFF1: Manufacturer indicates human reactivity by immunoblot. We have demonstrated its specificity in prior work (Wan et al. 
2017) using AFF1 KD cells. 

7. PAF1: Manufacturer indicates human reactivity by immunoblot. We have demonstrated its specificity using PAF1 KD cells. 

8. DOT1L: Manufacturer indicates human reactivity by immunoblot. We showed it can be used for western blot in current study. 
9. Myc: Manufacturer indicates reactivity by immunoblot to detect Myc epitope-tagged proteins. We have demonstrated its 
specificity in prior work (Wan et al. 2017) in co-immunoprecipitation assays. 

10. CDK9: Manufacturer indicates human and mouse reactivity by immunoblot. We showed that it can be used for ChIP-seq in 
current study. 

11. Pol II (ser-2p): Manufacturer indicates human and mouse reactivity by immunoblot. We have demonstrated in prior work 
(Wan et al. 2017) that this antibody can be used for ChIP-seq. 

12. H3K79me2: Manufacturer indicates human and mouse reactivity by immunoblot. We have demonstrated in prior work (Wan 
et al. 2017) that this antibody can be used for ChIP-seq. 

13. E-cadherin: Manufacturer indicates human and mouse reactivity by IF. 

14. WT-1: Manufacturer indicates human and mouse reactivity by IF. 

15. Biotinylated Lotus Tetragonolobus Lectin (LTL): Manufacturer indicates application for IF staining. 


Eukaryotic cell lines 


Policy information about cell lines 


Cell line source(s) HEK293 and HK-2 lines were purchased from ATCC; mESCs were derived in house 


Authentication For cells purchased directly from ATCC, cells are authenticated by sequencing at ATCC. During culture, parental lines were 
authenticated based on the testing and monitoring of phenotypic features (morphology, differentiation potential, growth 
conditions, etc.) characteristic of each line that were previously reported by manufacturers and other groups. Immunoblot 
analysis was performed to confirm the genotypes after transgene/siRNA expression. 
Mycoplasma contamination All cell lines tested negative for mycoplasma contamination. 


Commonly misidentified lines No commonly misidentified lines were used in this study. 
(See ICLAC register) 




ChIP-seq 


Data deposition 


Confirm that both raw and final processed data have been deposited in a public database such as GEO. 


Confirm that you have deposited or provided access to graph files (e.g. BED files) for the called peaks. 


Data access links https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE125186 

May remain private before publication. Token: gdwbimyqlhyxlar 
Methodology 


Replicates HEK293 Flag ChIP-seq: wt, T1, T2, T3 and control: 2 replicates 
HEK293 Flag-ENL wt, T1, T2 and T3 RNA-seq: 2 replicates 
HK-2 Flag-ENL wt, T1, T2 and T3 RNA-seq: 2 replicates 
all others: 1 replicate 
Note: Key target genes were validated by ChIP-qPCR and gene expression qPCR analyses for all conditions in additional, 
independent repeated experiments (n>3). 


Sequencing depth 
Sample                                   Sequencing depth 
GSM3564782 ChIPSeq.PolII_S2P_T1_mut      74852492 
GSM3564783 ChIPSeq.PolII_S2P_T2_mut      86431056 
GSM3564784 ChIPSeq.PolII_S2P_T3_mut      82360591 
GSM3564785 ChIPSeq.PolII_S2P_WT_ENL      67652603 
GSM3564786 ChIPSeq.CDK9_T1_mut           66925666 
GSM3564787 ChIPSeq.CDK9_T2_mut           76629800 
GSM3564788 ChIPSeq.CDK9_T3_mut           94390261 
GSM3564789 ChIPSeq.CDK9_WT_ENL           84910393 
GSM3564790 ChIPSeq.HEK293_WT_ENL_B       68322465 
GSM3564791 ChIPSeq.HEK293_T1_mut_C       93069874 
GSM3564792 ChIPSeq.HEK293_T1_Y78A_mut    75093254 
GSM3564793 ChIPSeq.HEK293_T2_B_mut       61175503 
GSM3564794 ChIPSeq.HEK293_T2_Y78A_mut    68255927 
GSM3564795 ChIPSeq.HEK293_T3_B_mut       68788471 
GSM3564796 ChIPSeq.HEK293_T3_Y78A_mut    69951070 
GSM3564797 ChIPSeq.HEK293_vec            61797238 
GSM3564798 ChIPSeq.HEK293_Y78A_B_mut     68043038 
GSM3564799 ChIPSeq.HEK293_F_ENL          71607516 
GSM3564800 ChIPSeq.HEK293_F_T1_mut       72912339 
GSM3564801 ChIPSeq.HEK293_F_T2_mut       82750115 
GSM3564802 ChIPSeq.HEK293_F_T3_mut       79967093 
GSM3564803 ChIPSeq.HEK293_input          82872696 
GSM3564804 ChIPSeq.HK2_F_ENL             80396084 
GSM3564805 ChIPSeq.HK2_F_T1_mut          62880686 
GSM3564806 ChIPSeq.HK2_F_T2_mut          59955976 
GSM3564807 ChIPSeq.HK2_F_T3_mut          62638629 
GSM3564808 ChIPSeq.HK2_input_S31         63931469 


Antibodies Antibodies used in ChIP-seq in this study: Flag (Sigma, F1804), CDK9 (Santa Cruz, sc-13130), Pol II S2P (Abcam, ab5095) and 
H3K79me2 (Abcam, ab3594). More details are given in the Supplementary Methods and Supplementary Tables. 


Peak calling parameters bowtie hg19 -v 1 -r --best --strata -m 1 -p 8 *.seq *.bowtie 
macs14 -t *.bed -c Input.bed -g hs --tsize=* --wig 
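
The two command lines above can be read flag by flag. The sketch below only assembles the same argument lists in Python, with a comment on each flag; it does not run either tool. The function names and file names are hypothetical, and the tag size is left as a caller-supplied value because the source elides it (`--tsize=*`).

```python
def bowtie_command(index, reads, output, threads=8):
    """Argument list mirroring the Bowtie 1 call quoted in the source."""
    return [
        "bowtie", index,
        "-v", "1",             # allow at most one mismatch per alignment
        "-r",                  # input is raw reads, one sequence per line
        "--best", "--strata",  # report only alignments in the best stratum
        "-m", "1",             # drop reads with more than one reportable hit
        "-p", str(threads),    # number of alignment threads
        reads, output,
    ]


def macs14_command(treatment_bed, control_bed, tag_size):
    """Argument list mirroring the MACS 1.4 call quoted in the source."""
    return [
        "macs14",
        "-t", treatment_bed,       # ChIP sample
        "-c", control_bed,         # input control
        "-g", "hs",                # effective genome size preset for human
        f"--tsize={tag_size}",     # read (tag) length; elided in the source
        "--wig",                   # also write wiggle signal tracks
    ]


# Hypothetical file names and tag size, for illustration only.
align = bowtie_command("hg19", "sample.seq", "sample.bowtie")
peaks = macs14_command("sample.bed", "Input.bed", tag_size=50)
```

Keeping the arguments as lists makes it straightforward to pass them to a process runner without shell quoting issues.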


Data quality FASTQC 0.11.8 was run to check the sequencing quality. 


Software FASTQC 0.11.8; Bowtie1 v1.1.0; MACS (version 1.4.2); R package VennDiagram v1.6; deepTools v2.0; 
edgeR 3.26.0. 
Neurodegeneration in patients with Parkinson’s disease is correlated with the 
occurrence of Lewy bodies, intracellular inclusions that contain aggregates of the 
intrinsically disordered protein α-synuclein. The aggregation propensity of 
α-synuclein in cells is modulated by specific factors that include post-translational 
modifications, Abelson-kinase-mediated phosphorylation and interactions with 
intracellular machineries such as molecular chaperones, although the underlying 
mechanisms are unclear. Here we systematically characterize the interaction of 
molecular chaperones with α-synuclein in vitro as well as in cells at the atomic level. 
We find that six highly divergent molecular chaperones commonly recognize a 
canonical motif in α-synuclein, consisting of the N terminus and a segment around 
Tyr39, and hinder the aggregation of α-synuclein. NMR experiments in cells show 
that the same transient interaction pattern is preserved inside living mammalian cells. 
Specific inhibition of the interactions between α-synuclein and the chaperone HSC70 
and members of the HSP90 family, including HSP90β, results in transient membrane 
binding and triggers a remarkable re-localization of α-synuclein to the mitochondria 
and concomitant formation of aggregates. Phosphorylation of α-synuclein at Tyr39 
directly impairs the interaction of α-synuclein with chaperones, thus providing a 
functional explanation for the role of Abelson kinase in Parkinson’s disease. Our 
results establish a master regulatory mechanism of α-synuclein function and 
aggregation in mammalian cells, extending the functional repertoire of molecular 
chaperones and highlighting new perspectives for therapeutic interventions for 
Parkinson’s disease. 


We characterized the interactions of an array of molecular chaperones
with α-synuclein on the basis of previous findings that have shown that
molecular chaperones share common patterns of client recognition.
The array included human HSC70 and HSP90, and the bacterial chaperones
SecB, Skp, SurA and Trigger Factor, all of which have strongly
diverse architectures. All of these chaperones interfered functionally
with the aggregation of α-synuclein in a thioflavin T assay, showing
effects already at a stoichiometry of 1:20 (chaperone:α-synuclein)
and even stronger effects at 1:10 ratios (Fig. 1a-c). The known HSP90
inhibitors geldanamycin and radicicol (referred to hereafter as drugs)
decreased the chaperoning effect of HSP90β (Fig. 1c), consistent with
the known mechanism of these drugs. We determined the segments
of α-synuclein that interact with the individual chaperones at the atomic
level by measuring the attenuation of the NMR signal intensity and
chemical-shift perturbations using two-dimensional [15N,1H]-NMR
spectroscopy. For all 6 chaperones, the effects were most pronounced
for 12 amino acid residues at the N terminus and for 6 residues around
Tyr39, indicating that a direct, albeit transient, intermolecular interaction
occurs via these 2 segments, which are therefore identified as the
canonical chaperone-interaction motif of α-synuclein (Fig. 1d-g and
Extended Data Figs. 1, 2). Inhibition of HSP90β using drugs partially
impaired the interaction with α-synuclein. For HSC70, the interaction
was observed in the ADP-bound (HSC70_ADP) and the ATP-bound
(HSC70_ATP), but not the apo, state (Fig. 1g and Extended Data Fig. 3),
consistent with previous reports (Supplementary Discussion).
Notably, for all six chaperones, the interactions were observed at
protein concentrations of 100 µM, which suggests that these interactions
are unlikely to arise from nonspecific effects of macromolecular
crowding. We investigated such nonspecific effects using high
concentrations of either bovine serum albumin (BSA) or ubiquitin. The
Fig. 1 | Molecular chaperones prevent aggregation through the interaction
with the N terminus of α-synuclein. a, b, Thioflavin T (ThT) emission curves of
300 µM α-synuclein in the presence or absence of chaperones (15 µM (a) or
30 µM (b)). c, Thioflavin T emission curves of 100 µM α-synuclein in the
presence of 5 µM HSP90β with and without the addition of 1 µM of drugs.
a-c, Data are mean ± s.d. (n = 3). AU, arbitrary units. d, Overlay of
two-dimensional [15N,1H]-NMR spectra of 250 µM [U-15N]-α-synuclein in the absence
(grey) and presence (yellow) of 500 µM of SecB tetramer (n = 3, with similar
results). e, Residue-resolved backbone amide NMR signal attenuation (I_rel = I/I_0)
of α-synuclein upon addition of two equivalents (eq.) of SecB tetramer (yellow),


signal was not attenuated after addition of 150-310 mg ml⁻¹ ubiquitin,
thus excluding the possibility that these interactions arose because
of macromolecular crowding effects. For high concentrations of BSA,
the canonical chaperone-interaction signature is observed (Fig. 1g
and Extended Data Fig. 3d-j), owing to the weak molecular chaperone
function of BSA. Taken together, these experiments using six
chaperones and two control proteins revealed that there is a canonical
chaperone interaction with α-synuclein at the N terminus and around
Tyr39 that is transient in nature. Notably, it comprises the two segments
of α-synuclein that are locally the most hydrophobic (Extended Data
Fig. 3k, l), indicating the importance of hydrophobic residues for the
interaction with chaperones.

To characterize the physiological role of chaperone-α-synuclein
interactions, we determined the affinity of α-synuclein for HSC70_ADP,
SecB and Skp using bio-layer interferometry. α-Synuclein binds to each
of these chaperones with affinities ranging from 1 to 2 µM (Extended
Data Fig. 4 and Supplementary Table 1). The ΔN-α-synuclein variant,
which lacks 10 N-terminal residues, shows a decrease in affinity of two
orders of magnitude, validating that this segment is part of the
interaction site. At the reported cellular concentrations of α-synuclein in
neuronal synapses of approximately 50 µM combined with a concentration
of around 70 µM of the chaperones HSP70 and HSP90, about
90% of cellular α-synuclein can therefore be bound to chaperones.
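
As a rough plausibility check of the figure of about 90%, the bound fraction can be estimated from the numbers quoted above. The sketch below assumes a single-site 1:1 binding equilibrium and treats all chaperones as one pool with a common dissociation constant. Both are simplifying assumptions made here for illustration, not necessarily the model used in the study, and the function name is hypothetical.

```python
import math


def bound_fraction(protein_total_um, chaperone_total_um, kd_um):
    """Fraction of protein in complex for a 1:1 equilibrium P + C <-> PC.

    Solves the quadratic mass-action equation for [PC]; all inputs in µM.
    Assumes one binding site and one pooled chaperone species (illustrative).
    """
    s = protein_total_um + chaperone_total_um + kd_um
    complex_um = (s - math.sqrt(s * s - 4.0 * protein_total_um * chaperone_total_um)) / 2.0
    return complex_um / protein_total_um


# Values quoted in the text: ~50 µM α-synuclein, ~70 µM chaperones, Kd of 1-2 µM.
for kd in (1.0, 2.0):
    print(f"Kd = {kd} µM: bound fraction = {bound_fraction(50.0, 70.0, kd):.2f}")
```

Under these assumptions the estimate stays above 90% at both ends of the measured affinity range, in line with the statement in the text.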

We then analysed published data on the NMR intensity profiles of
α-synuclein inside living mammalian cells, and found that these data
feature the canonical chaperone-interaction signature. Because this
Trigger Factor dimer (orange), Skp trimer (red) or SurA dimer (dark red).
f, Overlay of two-dimensional [15N,1H]-NMR spectra of [U-15N]-α-synuclein in the
absence (grey) and presence (cyan) of two equivalents HSP90β dimer (n = 2,
with similar results). g, h, Residue-resolved backbone amide NMR signal
attenuation (I_rel = I/I_0) of α-synuclein upon addition of two equivalents of
HSP90β dimer (cyan), HSC70_ADP (light blue), and ubiquitin (dark blue) as well as
E. coli cell extract (green), mammalian MDCK-II cell extract (blue) and
mammalian HEK293 cell extract (green). e, g, h, Values that are less than 1.0
indicate intermolecular interactions.
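
The residue-resolved attenuation I_rel = I/I_0 used in these panels can be computed as a simple per-residue ratio. The following is a minimal sketch: the peak intensities are made-up numbers, and the function names and the cutoff of 0.8 are hypothetical choices for illustration (the caption only states that values below 1.0 indicate an interaction).

```python
def relative_intensities(i_bound, i_free):
    """Return I_rel = I / I_0 per residue.

    i_bound: peak intensities with chaperone added (I), keyed by residue number.
    i_free:  peak intensities of the free protein (I_0), keyed by residue number.
    Residues missing from either data set, or with I_0 <= 0, are skipped.
    """
    return {
        res: i_bound[res] / i_free[res]
        for res in i_free
        if res in i_bound and i_free[res] > 0
    }


def attenuated_residues(i_rel, cutoff=0.8):
    """Residues whose intensity ratio falls below an illustrative cutoff."""
    return sorted(res for res, ratio in i_rel.items() if ratio < cutoff)


# Made-up peak intensities (arbitrary units), for illustration only.
i_free = {3: 100.0, 5: 120.0, 39: 90.0, 100: 110.0}
i_bound = {3: 40.0, 5: 60.0, 39: 45.0, 100: 108.0}

i_rel = relative_intensities(i_bound, i_free)
print(attenuated_residues(i_rel))
```

In practice the cutoff would be chosen relative to the noise of the spectra rather than fixed in advance.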


pattern has been suggested to arise from interactions with cellular
membranes, we first characterized α-synuclein in soluble cellular
extracts, which were devoid of membranes, from Escherichia coli cells
or mammalian HEK293 and MDCK-II cells. Notably, in each case we
observed the canonical chaperone-interaction pattern (Fig. 1h and
Extended Data Fig. 5a-d), indicating that this pattern does not result
from the interaction with membranes. Second, we characterized the
interaction pattern of α-synuclein with lipid bilayer membranes in vitro.
Titrating large unilamellar vesicles (LUVs) with α-synuclein in a 125:1
lipid:protein ratio leads to a uniform decrease in the NMR signal for
amino acid residues 1-90 of α-synuclein (Extended Data Fig. 6a), in
agreement with previously published reports. Adding 2-6 equivalents
of SecB to solutions containing α-synuclein and LUVs restored
the chaperone signature, whereas the reverse experiment (that is,
the addition of LUVs to an existing SecB-α-synuclein complex) led
to attenuation of the NMR signal for amino acid residues 1-90 of
α-synuclein, indicating that LUVs and SecB mutually compete for binding
to α-synuclein (Extended Data Fig. 6). Overall, the data suggest that
α-synuclein is in an equilibrium between its free state, its
membrane-bound state and its chaperone-bound state, of which the last two states
are mutually exclusive. The emerging hypothesis that, in mammalian
cells, α-synuclein is predominantly in contact with chaperones
rather than with the lipid bilayer was supported by the experimental
determination of the interactome of the N terminus of α-synuclein in
mammalian cells using chemical cross-linking and mass spectrometry.
The interactome consists of a large number of molecular chaperones
Fig. 2 | The interaction between α-synuclein and chaperones is dominant in
living cells. a, Abundance ratios of proteins bound to ΔN-α-synuclein versus
wild-type full-length α-synuclein determined by relative quantitative mass
spectrometry (mean values, n = 2). b, Overlay of two-dimensional [15N,1H]-NMR
spectra of [U-15N]-α-synuclein in NMR buffer (black) and inside living HEK293
cells (blue-green). Representative spectrum from n > 5. c, Residue-resolved
backbone amide NMR signal attenuation (I_cell/I_buffer) of α-synuclein in


that had abundances ranging between 30 and 75%, including several 
isoforms of HSP90 and six HSP70 isoforms (Fig. 2a; see Supplementary 
Information for details). 


NMR spectroscopy in cells 


Next, we carried out NMR experiments in cells to study the interaction
between α-synuclein and chaperones inside living mammalian cells
at atomic resolution. [U-15N]-α-Synuclein was delivered into HEK293
cells at concentrations of 3-10 µM, yielding intensity patterns that are
mammalian cells. d, NMR signal attenuation in treated cells, relative to
untreated cells (I/I_ref). Different combinations of HSC70 depletion and HSP90
inhibition were applied, as indicated. e, f, Overlay of two-dimensional
[15N,1H]-NMR spectra of [U-15N]-α-synuclein in untreated HEK293 cells (black) and in
HSC70-depleted HEK293 cells (green) (e) or in HSC70-depleted HEK293 cells
after 24 h of HSP90 inhibition (green) (f). Representative data (d-f) for three
technical replicates, with similar results.


characteristic for mammalian cell lines (Fig. 2b, c), such as the canonical
chaperone-interaction signature. Multiple molecular chaperones
are present in the cell that have mutually overlapping functions and
‘clientomes’. To complement the in vitro chaperone analyses, we
investigated two of the most abundant chaperones found in mammalian
cells, HSC70 and HSP90β. When [U-15N]-α-synuclein was delivered into
HEK293 cells with reduced HSC70 levels (Extended Data Fig. 7c, d), the
NMR intensity profile resembled the one observed for untreated cells,
suggesting that there is functional redundancy between HSC70 and
other chaperones in these cells (Fig. 2d, e). Next, we treated HEK293
Fig. 3 | Co-localization of α-synuclein and cellular organelles assessed using
immunofluorescence. a-f, Immunofluorescence analysis of α-synuclein
electroporated into HEK293 cells. Cells were treated with either a control short
hairpin RNA (shRNA) targeting firefly luciferase (shLUC) or a combination of
an shRNA targeting HSC70 (shHSC70) and inhibitors of HSP90 (shHSC70 +
drugs). Cells were stained with MitoTracker (red; a) to stain mitochondria, DAPI
(blue) to stain cell nuclei, an α-synuclein-specific antibody (green) and either
wheat germ agglutinin (WGA; red in b) to stain the plasma membrane and the
endoplasmic reticulum or LysoTracker (red in c) to stain acidic vesicles such
as lysosomes. Outlines indicate areas of intense signal for MitoTracker and
α-synuclein. Solid outlines, top magnifications; dashed outlines, bottom
magnifications. d, Cox IV (red, mitochondrial marker) and α-synuclein (green)


cells with the HSP90-inhibiting drugs, and found that the canonical
chaperone-interaction motif showed increased intensities compared
to untreated cells (Fig. 2d). This suggests that HSP90 chaperones
physically and transiently interact with α-synuclein in cells, and that this
interaction is lost upon drug treatment. Immunoprecipitation assays
confirmed that this interaction is almost completely lost 24 h after
treatment (Extended Data Fig. 7e). Finally, we simultaneously inhibited both
HSC70 and HSP90, and observed a moderate effect on the canonical
chaperone-interaction motif 4 h after treatment, at which point a
substantial fraction of HSP90 still remains bound to α-synuclein (Extended
Data Fig. 7e). At this time point, a low but measurable amount of free
intracellular α-synuclein was observed (Fig. 2d). At 24 h after treatment,
a marked global reduction in the signal of amino acid residues 1-90 of
α-synuclein was observed, which was essentially identical to the LUV
interaction pattern and to the profile that has previously been reported
in which α-synuclein was bound to bacterial membranes (Fig. 2d, f).
The combined inhibition of the two types of chaperone (HSC70
and HSP90) therefore leads to a transient membrane interaction of
α-synuclein, which is absent in the basal state of cells. Furthermore, in
these experiments, we observed the formation of stable
high-molecular-mass aggregates that contained α-synuclein (Extended Data Fig. 7f).
Overall, these in-cell NMR and in vitro experiments show that, in cells,

were visualized by specific antibodies, nuclei were stained with DAPI (blue).
Circles indicate cells with high α-synuclein content, brackets indicate cells with
low α-synuclein content. Solid outlines, magnified on the right. Arrows
indicate the positions of selected colocalization spots. e, f, Control HEK293
cells (shLUC; e) or HEK293 cells treated for the combined knockdown of HSC70
and inhibition of HSP90 (shHSC70 + drugs; f) were stably transfected with an
expression plasmid containing the mitochondrial marker mtBFP. Cells were
fixed and subjected to immunofluorescence analyses using an anti-α-synuclein
antibody. Propidium iodide (PI) was used to stain cells to enable the
visualization of cell morphology. Note, the blue colour of mtBFP was changed
to green to better visualize the co-localization of mtBFP and α-synuclein. Scale
bars, 10 µm. Experiments were performed twice, with similar results.


α-synuclein transiently interacts with a pool of constitutively expressed
chaperones and that this interaction predominates over the transient
interaction of α-synuclein with lipid bilayer membranes. In cells such
as neurons, as well as in our experiments using HEK293 cells, the
concentration of chaperones is substantially larger than the
concentration of α-synuclein, highlighting the physiological relevance of these
observations (Extended Data Fig. 7g, h).


Intracellular membrane localization 


The interactions between α-synuclein and cellular membranes after
inhibition of HSC70 and HSP90 may be a key mechanism for disease
pathogenesis and we thus aimed to identify the membranous organelle
that is involved using co-localization analyses. To this end, control
cells and HEK293 cells depleted of HSC70 and treated with drugs for
24 h were first stained with MitoTracker (which stains mitochondria),
LysoTracker (which stains acidic vesicles such as lysosomes) or
Alexa-Fluor-labelled wheat germ agglutinin (which stains the plasma
membrane and endoplasmic reticulum) and subsequently immunostained
with anti-α-synuclein antibodies. These experiments revealed a strong
colocalization of α-synuclein with mitochondria after the chaperones
were depleted (Fig. 3a-c). To further confirm this association, we
Fig. 4 | Effect of post-translational modifications on the
chaperone-α-synuclein interaction. a, Modified α-synuclein variants.
b-e, Residue-resolved backbone amide NMR signal attenuation (ΔI_rel = 1 − I/I_0)


carried out immunofluorescence analyses using antibodies that were
specific to the mitochondrial marker CoxIV and α-synuclein (Fig. 3d). In
a complementary experiment, we expressed the marker mitochondrial
blue-fluorescent protein (mtBFP) in control and HSC70- and
HSP90-deficient HEK293 cells and stained α-synuclein with antibodies (Fig. 3e,
f). Both approaches confirmed the localization of α-synuclein to
mitochondria after HSC70 and HSP90 inhibition.


Effect of post-translational modifications 


After establishing the canonical chaperone-interaction signature
and validating its presence in living mammalian cells, we investigated
the effect of chemical modifications on the α-synuclein-chaperone
interaction. Using the chaperones HSP90β, HSC70_ADP, SecB and Skp,
we analysed the effects of N-terminal acetylation of α-synuclein, the
predominant form in mammalian cells. N-terminal acetylation does
not interfere with the interaction between α-synuclein and chaperones
(Fig. 4 and Extended Data Fig. 8a-g). By contrast, ΔN-α-synuclein has
a greatly reduced interaction with all chaperones, in agreement with
the bio-layer interferometry experiments, which shows a synergistic
effect between the N terminus and the amino acid region around Tyr39
(Fig. 4b-e). Cellular oxidative stress and an imbalance in reactive
oxygen species are known hallmarks of the onset of Parkinson’s disease,
leading to the oxidative modification of α-synuclein. Titration of
HSP90β, HSC70_ADP, SecB or Skp with methionine-oxidized α-synuclein
showed that oxidation of Met1 and Met5 abolishes the N-terminal
chaperone interaction (Fig. 4 and Extended Data Fig. 9). Next, we explored
the effects of phosphorylation on the interaction with chaperones,
using in vitro tyrosine phosphorylation by different kinases (Fig. 4
and Extended Data Fig. 9). Titration of SecB, Skp, HSP90β or HSC70_ADP
with either tetra-phosphorylated or Tyr39-mono-phosphorylated
α-synuclein resulted in the elimination of the chaperone interaction,
whereas Tyr125-Tyr133-Tyr136-tri-phosphorylated α-synuclein showed
of the α-synuclein variants upon interaction with two equivalents of HSP90β
dimer (b), HSC70_ADP (c), SecB tetramer (d) or Skp trimer (e). Increased ΔI_rel
values are indicative of an interaction.


the chaperone-interaction pattern of unmodified α-synuclein (Fig. 4).
Tyr39 phosphorylation therefore has a specific inhibitory effect on the
interaction with chaperones, providing a direct rationale for in vivo
studies that have shown that upregulation of Abelson kinase (c-Abl)
correlates strongly with Tyr39 phosphorylation and disease progression
in Parkinson’s disease.


Conclusion 


In summary, we have identified a functional mechanism for the regulation
of α-synuclein by chaperones in mammalian cells through transient
binding (Extended Data Fig. 10). Molecular chaperones bind
to α-synuclein through a canonical motif, by recognizing intrinsic
biophysical features at the N terminus and around Tyr39. The interaction
is abrogated after inhibition of two major chaperones, and results in
transient interactions of α-synuclein with cellular membranes and
relocalization of α-synuclein to mitochondria. Aggregates of α-synuclein,
as well as mitochondria, have been identified as major components of
Lewy bodies. We propose a model in which α-synuclein is predominantly
found in a transient chaperone-interacting state in healthy cells,
indicating that chaperones are a master regulator of the cellular states
of α-synuclein. The model also predicts that changes in the activity or
cellular levels of chaperones or α-synuclein (or the modulation of their
interaction) will disturb the homeostatic balance, eventually causing
or promoting Parkinson’s disease. Notably, this model is in agreement
with a multitude of reported experimental observations (Supplementary
Discussion), including studies that have shown that the ratio of
α-synuclein to chaperone is deteriorated in familial parkinsonism and
that oxidative stress can lead to an increase in the phosphorylation
of Tyr39 of α-synuclein, which interferes with chaperone binding.
The model further shows how modulation of chaperone activity might
prevent the formation of oligomeric α-synuclein, the aggregation of
which leads to the disruption of the mitochondrial membrane, and
also accounts for recent reports that impairment of mitochondria may
constitute an important factor in Parkinson’s disease.


Reporting summary 
Further information on research design is available in the Nature 
Research Reporting Summary linked to this paper. 
The data that support the findings of this study are available from the
corresponding authors upon request.
Extended Data Fig. 1 | Interaction between α-synuclein and bacterial
chaperones. a-c, Overlay of two-dimensional [15N,1H]-NMR spectra of 250 µM
[U-15N]-α-synuclein in the absence (grey) and presence (orange, red or dark red)
of 500 µM chaperones. The sequence-specific assignments for significantly
affected resonances are indicated. d, Residue-resolved chemical-shift
perturbations of α-synuclein caused by the addition of two equivalents of SecB
tetramer (yellow), Trigger Factor dimer (orange), Skp trimer (red) or SurA
dimer (dark red). Broken lines indicate a significance level of two s.d. from the
mean. e, Temperature dependence of the α-synuclein interaction with either
SecB (yellow) or Skp (red) monitored by residue-resolved intensity ratios
(I_rel = I/I_0) of 13C-direct-detected two-dimensional [15N,13C]-NMR spectra. The
intensity ratios of two-dimensional [15N,1H]-NMR spectra at 281 K (Fig. 1c) are
shown as an outline (grey). f, g, Overlay of two-dimensional [13C,15N]-NMR
spectra of 500 µM [U-13C,15N]-α-synuclein in the absence (grey) and presence of
1 mM of SecB tetramer (f; yellow) or 1 mM of Skp trimer (g; red). Experiments
were performed at 281 K and 310 K as indicated. The sequence-specific
resonance assignment is shown. Experiments in a-c, f, g were done in
duplicates, with similar results.
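
The significance criterion in panel d (two s.d. from the mean) amounts to a simple threshold on the per-residue perturbations. The sketch below is illustrative only: the perturbation values are invented, the function name is hypothetical, and the use of the population standard deviation over all residues is an assumption, since the caption does not specify how the s.d. was computed.

```python
import statistics


def significant_residues(csp_ppm, n_sd=2.0):
    """Flag residues whose chemical-shift perturbation exceeds mean + n_sd * s.d.

    csp_ppm: mapping of residue number to perturbation (ppm).
    Returns the threshold and the sorted list of flagged residues.
    The population s.d. is used here (an assumption).
    """
    values = list(csp_ppm.values())
    threshold = statistics.mean(values) + n_sd * statistics.pstdev(values)
    flagged = sorted(res for res, value in csp_ppm.items() if value > threshold)
    return threshold, flagged


# Invented perturbations: nine unaffected residues and one clearly shifted residue.
csp = {res: 0.01 for res in range(1, 11)}
csp[4] = 0.10

threshold, flagged = significant_residues(csp)
print(f"threshold = {threshold:.3f} ppm, flagged residues = {flagged}")
```

Setting `n_sd=1.0` gives the one-s.d. criterion used in some of the other extended-data panels.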
Extended Data Fig. 2 | Chaperones Skp and Trigger Factor bind α-synuclein
at their native client sites. a, Overlay of two-dimensional [15N,1H]-NMR spectra
of 250 µM [U-2H,15N]-Skp in the absence (grey) and presence (red) of 750 µM
α-synuclein. b, Residue-resolved NMR signal intensity ratios (I_rel = I/I_0) of Skp
(250 µM) in the presence of three equivalents of α-synuclein measured at 310 K.
The thin dashed lines indicate a significance level of one s.d. from the mean.
The solid line represents an intensity ratio of 1. c, α-Synuclein-induced intensity
changes plotted on the Skp crystal structure (RCSB Protein Data Bank (PDB) code
1SG2) and previously reported effects upon binding of its native client
OmpX. A decrease in the signal intensity of more than one s.d. is highlighted in
blue, whereas an increase in signal intensity is highlighted in red. d, Overlay of
two-dimensional [15N,1H]-NMR spectra of 250 µM [U-2H,15N]-Skp in the absence
(grey) and presence (blue) of 500 µM BSA. e, Residue-resolved NMR signal
intensity ratios (I_rel = I/I_0) of Skp (250 µM) in the presence of two equivalents of
BSA measured at 310 K. The solid line represents an intensity ratio of 1.
f, Overlay of two-dimensional [15N,1H]-NMR spectra of 250 µM [U-2H,15N]-TF(ΔRBD),
a monomeric Trigger Factor (TF) variant that lacks its ribosome-binding
and main dimerization domain (RBD), in the absence (grey) and
presence (orange) of 750 µM α-synuclein. g, Residue-resolved NMR signal
intensity ratios (I_rel = I/I_0) of 250 µM TF(ΔRBD) in the presence of three
equivalents of α-synuclein measured at 298 K. The thin broken lines indicate a
significance level of one s.d. from the mean. The thick line represents an
intensity quotient of 1. h, Residue-resolved combined chemical-shift
differences of the amide moieties. The broken line indicates a significance level
of two s.d. from the mean. i, Significant chemical-shift changes (green) and
intensity decrease (blue) plotted on the Trigger Factor structure (PDB 1W26).
Comparison with the published Trigger Factor interaction sites of PhoA
(orange). j, Overlay of two-dimensional [15N,1H]-NMR spectra of 250 µM
[U-2H,15N]-TF(ΔRBD) in the absence (grey) and presence (blue) of 500 µM BSA.
k, Residue-resolved NMR signal intensity ratios (I_rel = I/I_0) of TF(ΔRBD) (250 µM)
in the presence of two equivalents of BSA measured at 298 K. The solid line
represents an intensity ratio of 1. Experiments with α-synuclein (a, f) were done
as duplicates yielding similar results, whereas control experiments with BSA
(d, j) were performed once.


Extended Data Fig. 3 | Interaction between α-synuclein and mammalian
proteins. a, Overlay of two-dimensional [15N,1H]-NMR spectra of 25 µM
[U-15N]-α-synuclein in the absence (grey) and presence (light blue) of 50 µM inhibited
HSP90β dimer. Measured in NMR buffer plus 5 mM MgCl2, 5 mM ATP, 1 µM
radicicol and 1 µM geldanamycin. b, Overlay of two-dimensional [15N,1H]-NMR
spectra of 100 µM [U-15N]-α-synuclein in the absence (grey) and presence (light
blue) of 200 µM HSC70. c, Overlay of two-dimensional [15N,1H]-NMR spectra of
100 µM [U-15N]-α-synuclein in the absence (grey) and presence (light blue) of
200 µM HSC70_ADP. Measured in NMR buffer plus 5 mM MgCl2 and 5 mM ADP.
d, Overlay of two-dimensional [15N,1H]-NMR spectra of 100 µM
[U-15N]-α-synuclein in the absence (grey) and presence (light blue) of 200 µM HSC70_ATP.
Measured in NMR buffer plus 5 mM MgCl2 and 5 mM ATP. e, Overlay of
two-dimensional [15N,1H]-NMR spectra of 250 µM [U-15N]-α-synuclein in the absence
(grey) and presence (blue) of 500 µM (33 mg ml⁻¹) BSA. f, Overlay of
two-dimensional [15N,1H]-NMR spectra of 250 µM [U-15N]-α-synuclein in the absence
(grey) and presence (dark blue) of 500 µM of ubiquitin. g, Residue-resolved
combined chemical-shift perturbations of amide moieties upon addition of
HSP90β (cyan), inhibited HSP90β (light cyan), HSC70 (light blue), HSC70_ADP
(light blue), HSC70_ATP (light blue), BSA (blue) and ubiquitin (dark blue). Broken
lines indicate a significance level of two s.d. from the mean. h, Residue-resolved
backbone amide NMR signal attenuation (I_rel = I/I_0) of α-synuclein caused by the
addition of two equivalents of inhibited HSP90β (light cyan), HSC70 (light
blue), HSC70_ATP (light blue) and BSA (blue). i, Residue-resolved NMR signal
attenuation (I_rel = I/I_0) of 100 µM [U-15N]-α-synuclein upon addition of increasing
BSA concentrations (50-250 mg ml⁻¹). j, Residue-resolved NMR signal
attenuation (I_rel = I/I_0) of 50 µM [U-15N]-α-synuclein upon addition of increasing
ubiquitin concentrations (25-125 mg ml⁻¹). k, Local hydrophobicity of
α-synuclein plotted against the amino acid sequence. ΔF are the free energies
of transfer of the individual amino acids from an aqueous solution to its
surface. Hydrophobicity corresponds to negative ΔF values. An exponentially
weighted seven-window average was applied to the raw data, with the edges
contributing 50%. The red line indicates the average value of 1.5 s.d. from the
mean, the chosen threshold for the identification of the most hydrophilic
segments. l, Sequence-dependent DnaK score for α-synuclein derived from a
computational DnaK prediction algorithm. Regions of the primary sequence
with scores less than −5 (red line) are predicted to bind DnaK, a bacterial
homologue of HSC70. Experiments in a-f were done in duplicates with similar
results.
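
The smoothing described in panel k can be sketched as follows. The caption does not give the exact weighting function, so this example assumes weights that decay exponentially from 1 at the central residue to 0.5 at the two edge positions of the seven-residue window, and it renormalizes truncated windows at the sequence ends. Both choices and the function names are assumptions made for illustration.

```python
def exp_window_weights(half_width=3, edge_weight=0.5):
    """Weights for a (2 * half_width + 1)-residue window.

    The central position has weight 1 and the weights decay exponentially
    so that the outermost positions have weight `edge_weight` (assumed form).
    """
    return [edge_weight ** (abs(k) / half_width)
            for k in range(-half_width, half_width + 1)]


def smooth(values, half_width=3, edge_weight=0.5):
    """Exponentially weighted moving average over a per-residue profile.

    Windows that extend past the sequence ends are truncated and the
    remaining weights renormalized (an assumption of this sketch).
    """
    weights = exp_window_weights(half_width, edge_weight)
    offsets = range(-half_width, half_width + 1)
    smoothed = []
    for i in range(len(values)):
        num = 0.0
        den = 0.0
        for k, w in zip(offsets, weights):
            j = i + k
            if 0 <= j < len(values):
                num += w * values[j]
                den += w
        smoothed.append(num / den)
    return smoothed


# Invented per-residue transfer free energies, for illustration only.
raw = [0.2, -1.1, -0.8, 0.5, 1.3, -0.4, -1.6, 0.0, 0.9, -0.3]
print([round(v, 2) for v in smooth(raw)])
```

A threshold such as the 1.5 s.d. line in the caption would then be applied to the smoothed profile rather than to the raw values.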


[Figure: bio-layer interferometry curves with fit residuals. Panels: a, Biotin-SecB; b, Biotin-Skp; c, Biotin-HSC70_ADP. Rows: α-synuclein, acetyl-α-synuclein, ΔN-α-synuclein. Axes: wavelength shift (nm) versus time (s).]

Extended Data Fig. 4 | Kinetic analysis of the interaction of the chaperones
with α-synuclein variants. a-c, Kinetic analysis by bio-layer interferometry of
biotinylated Skp (a), SecB (b) and HSC70_ADP (c) to different α-synuclein variants
(α-synuclein (top), acetyl-α-synuclein (middle) and ΔN-α-synuclein (bottom)).
Black lines represent least-square fits to the data. The residuals of the fits are
shown below each set of bio-layer interferometry curves. Each individual
kinetic experiment was run twice in triplicates with similar results.


[Figure: panels a-c, two-dimensional NMR spectra of α-synuclein alone and with E. coli, MDCK-II or HEK-293 cell extract (axes: δ(1H) and δ(15N), ppm); panel d, chemical-shift perturbation Δδ(HN) (ppm) versus α-synuclein residue number for the three extracts.]


Extended Data Fig. 5 | Interaction between α-synuclein and cellular
extracts. a, Overlay of two-dimensional [15N,1H]-NMR spectra of 50 µM
[U-15N]-α-synuclein in the absence (black) and presence (green) of 25 mg ml⁻¹ of E. coli
cell extract. b, Overlay of two-dimensional [15N,1H]-NMR spectra of 50 µM
[U-15N]-α-synuclein in the absence (black) and presence of 50 mg ml⁻¹ mammalian
MDCK-II cell extract (blue-green). c, Overlay of two-dimensional [15N,1H]-NMR
spectra of 50 µM [U-15N]-α-synuclein in the absence (black) and presence
(green) of 50 mg ml⁻¹ mammalian HEK293 cell extract. d, Residue-resolved
combined chemical-shift perturbations of the α-synuclein amide moieties in
E. coli cell extract (green), mammalian MDCK-II cell extract (blue) and
mammalian HEK293 cell extract (green), all relative to aqueous buffer. Broken
lines indicate a significance level of two s.d. from the mean. Experiments in a-c
were done in duplicates with similar results.


Extended Data Fig. 6 | LUVs and the chaperone SecB compete for α-synuclein
binding. a, Residue-resolved backbone amide NMR signal attenuation
(I_rel = I/I_0) of α-synuclein caused by the addition of 5 mg ml⁻¹ LUVs (125:1 molar
ratio of lipid:protein; dark yellow) and after further addition of 2 equivalents of
SecB (yellow). b, Residue-resolved backbone amide NMR signal attenuation
(I_rel = I/I_0) of α-synuclein caused by the addition of 15 mg ml⁻¹ LUVs (375:1 molar
ratio of lipid:protein; dark yellow) and after further addition of 2 and 6 equivalents
of SecB, respectively (yellow), measured at 298 K. c, Residue-resolved
backbone amide NMR signal attenuation (I_rel = I/I_0) of α-synuclein caused by the
addition of 2 equivalents of SecB (yellow) and increasing amounts of LUVs with
the following ratios: 2.5 mg ml⁻¹, 62.5:1; 4.0 mg ml⁻¹, 100:1; 6.25 mg ml⁻¹, 156:1;
8.5 mg ml⁻¹, 212.5:1. d, Schematic showing the conformational equilibrium of
free α-synuclein, its chaperone-bound state and one possible conformation of
its LUV-bound state (PDB 1XQ8). Notably, these observations are also in full
agreement with related studies for HSP90 and HSP27. e, Dynamic light
scattering (DLS) measurements of LUVs prepared from pig brain polar lipids.
Two independent preparations are shown in blue and orange, respectively, with
an average diameter of 110 nm.
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Extended Data Fig. 7 | Interaction of x-synuclein and chaperones in cells. 

a, Western blot analysis of the expression of a-synuclein fused to a C-terminal 
haemagglutinin (HA)-tag in HEK293 cells. The molecular mass marker and the 
band corresponding to a-synuclein-HA are indicated. With these samples, 
immunoprecipitation and subsequent mass-spectrometry analysis was 
performed (band Fig. 2a). b, Intensity ratios of carboxy-terminally HA-tagged 
AN-a-synuclein and a-synuclein immunoprecipitation determined by relative 
quantitative mass-spectrometry analysis. Experiments were performed as 
duplicates in HEK293 cells. Identification of at least five peptides per protein 
was required for quantification. Data are mean. The dotted line represents an 
intensity ratio of 1. Proteins that belong to specific groups are highlighted in 
colours. The values for a-synuclein (green) as well as tubulin 84 and tubulin 01B 
(orange arrows from left to right) are indicated by coloured arrows. 

c, Efficiency of HSC70 knockdown in HEK293 cells (constitutively expressing 
the T-Rex repressor) stably transfected with an inducible shRNA targeting 
HSC70 mRNA (shHSC70). The image shows a representative semiquantitative 
reverse-transcription (RT)-PCR of HSC70 mRNA in cells treated with 
doxycycline to induce shHSC70 and geldanamycin (Gel) and radicicol (Rad) for 
24 h (+). Cells transfected with a control shRNA targeting firefly luciferase 
(shLUC) as well as semiquantification of an unrelated chaperone (HSP40) were 
included as negative and loading controls. d, Semiquantification of HSC70 and 
HSP90 protein levels by western blot. HEK293 cells (constitutively expressing 
the T-Rex repressor) stably transfected with shHSC70 and shLUC were grown in 
normal (-) or doxycycline-containing (+) medium for HSC70 knockdown. The 
cells were subsequently treated with vehicle (-) or geldanamycin and radicicol 
for HSP90 inhibition. The constitutively expressed protein GAPDH was assayed 
as loading control. e, Efficiency of the combined treatment of geldanamycin 
and radicicol in disrupting the α-synuclein-HSP90 interaction. HEK293 cells 
were treated with geldanamycin and radicicol for 4 or 24 h and then 
electroporated with recombinant α-synuclein using the protocol for in-cell 
NMR experiments. Whole-cell lysates were collected and used in 
immunoprecipitation assays with anti-α-synuclein antibodies. The obtained 
precipitates were then resolved by SDS-PAGE and analysed by western blot 
using the indicated antibodies. In addition to HEK293 cells with normal levels 
of HSP90 (control cells), cells with reduced levels of HSP90 (shHSP90) were 
used to validate the HSP90 band. f, Inhibition of both HSP90 and HSC70 
promotes aggregation of α-synuclein. The image shows a representative 
semiquantitative western blot of HSC70-depleted HEK293 cells treated with 
geldanamycin and radicicol. After 24 h of treatment, the cells were subjected to 
electroporation with recombinant α-synuclein and 4 h after electroporation 
the cells were collected and analysed by western blot. HMW and 14 kDa refer to 
high-molecular-weight and monomeric α-synuclein species, respectively. 
g, h, Quantification of intracellular levels of HSP90 and electroporated 
α-synuclein in HEK293 cells by parallel reaction monitoring mass 
spectrometry. A standard curve (contained in the yellow boxes) using 
increasing amounts of recombinant HSP90 (g) or α-synuclein (h) enables the 
relative quantification of the intracellular protein levels. As surrogates for 
intracellular protein levels, at least four tryptic peptides of HSP90 (g) or human 
α-synuclein (h) were quantified. Targeted peptides are shown at the top of each 
plot, and at least four transitions of the y-series of the product ions were 
monitored over the chromatographic separation of the peptides (different 
colours). The determined cellular concentrations of HSP90 and α-synuclein 
were 30 μM and 2.5 μM, respectively (see Supplementary Methods for details 
of this calculation). cps, counts per second. The original and uncropped gels of 
a, c-f can be found in Supplementary Fig. 1. Western blot and PCR experiments 
(a, c-f) were done in duplicate, with similar results. 
Extended Data Fig. 8 | Sequence-specific NMR-resonance assignments of 
α-synuclein variants. a-c, Two-dimensional [¹⁵N, ¹H]-NMR spectra of 500 μM 
[U-¹³C, ¹⁵N]-α-synuclein (grey), 450 μM [U-¹³C, ¹⁵N]-acetyl-α-synuclein (dark 
violet) and 100 μM [U-¹⁵N]-ΔN-α-synuclein (dark blue). The sequence-specific 
resonance assignments for wild-type as well as acetylated α-synuclein obtained 
from three-dimensional triple resonance experiments and from chemical-shift 
mapping of ΔN-α-synuclein are indicated. d, e, Two-dimensional [¹³C, ¹⁵N]-NMR 
spectra of 500 μM [U-¹³C, ¹⁵N]-α-synuclein (grey) and 450 μM [U-¹³C, ¹⁵N]-acetyl-α-synuclein 
(dark violet). The sequence-specific resonance assignments for 
wild-type and acetylated α-synuclein obtained from three-dimensional triple 
resonance experiments are indicated. f, Residue-resolved combined chemical-shift 
perturbations of the amide moieties for acetyl-α-synuclein (dark violet) 
and ΔN-α-synuclein (dark blue) versus wild-type α-synuclein. g, Residue-resolved 
combined chemical-shift difference of the carbonyl-amide moieties 
for acetyl-α-synuclein (dark violet) versus wild-type α-synuclein. [¹⁵N, ¹H]-NMR 
spectra in a-c were measured five times and [¹³C, ¹⁵N]-NMR spectra (d, e) were 
measured in duplicate, all yielding similar results. 
Extended Data Fig. 9 | Sequence-specific NMR-resonance assignments of 
methionine-oxidized and tyrosine-phosphorylated α-synuclein variants. 
a-c, Two-dimensional [¹⁵N, ¹H]-NMR spectra of 100 μM oxidized [U-¹⁵N]-α-synuclein 
(light grey), 100 μM oxidized [U-¹⁵N]-acetyl-α-synuclein (violet) and 
100 μM oxidized [U-¹⁵N]-ΔN-α-synuclein (blue). The sequence-specific 
resonance assignments from chemical-shift mapping and published 
assignments of the oxidized state are indicated. Oxidized methionines are 
highlighted in red. d, Residue-resolved combined chemical-shift differences of 
the amide moieties for oxidized α-synuclein (light grey), oxidized acetyl-α-synuclein 
(violet) and oxidized ΔN-α-synuclein (blue) relative to their 
respective reduced states. Colours as in a-c. Arrows indicate the positions of 
the oxidized methionines. e-g, Two-dimensional [¹⁵N, ¹H]-NMR spectra of 
50 μM [U-¹⁵N]-mono-phospho-α-synuclein (red-brown), 50 μM [U-¹⁵N]-tri-phospho-α-synuclein 
(brown) and 50 μM [U-¹⁵N]-tetra-phospho-α-synuclein 
(dark brown). The sequence-specific resonance assignments based on 
published assignments for phosphorylated α-synuclein are indicated. 
Phosphorylated residues are highlighted in cyan. h, Residue-resolved 
combined chemical-shift differences of the amide moieties for the 
phosphorylated α-synuclein variants relative to wild-type α-synuclein. Colours 
as in e-g. Arrows indicate the positions of the phosphorylated tyrosines. 
[¹⁵N, ¹H]-NMR spectra of the different modified α-synuclein variants were 
measured several times (n = 4) yielding similar results. 
Extended Data Fig. 10 | Mechanism of chaperone-controlled regulation of 
α-synuclein function, conformation and localization in mammalian cells. 
Cellular chaperones (yellow) interact with the N-terminal segment of 
α-synuclein (red), thus actively regulating its functional species by shifting 
conformational equilibria. Impairment of the natural α-synuclein-chaperone 
ratio or abrogation of the α-synuclein-chaperone interaction by post-translational 
modifications can lead to the formation of pathological species, 
including the accumulation of α-synuclein at mitochondria. 
Data collection: NMR data were collected on Bruker spectrometers operated with TOPSPIN 3.0-3.5. 

Data analysis: NMR data were processed with PROSA and analyzed with CARA. MS data were analyzed with Skyline (MacCoss, Version 3.7). 
Life sciences study design 


All studies must disclose on these points even when the disclosure is negative. 


Sample size: No statistical methods were used to predetermine sample size. Sample sizes were chosen in agreement with established procedures in the 
field. 

Data exclusions: No data were excluded from the analyses. 

Replication: Experiments were replicated to ensure reproducibility of the findings. The number of independent replicates for each experiment is 
specified in the respective figure captions. All attempts at replication were successful. 

Randomization: The experiments were not randomized, in agreement with established procedures in the field. 

Blinding: The investigators were not blinded to allocation during experiments and outcome assessment, in agreement with established procedures in 
the field. 


Reporting for specific materials, systems and methods 


Materials & experimental systems | Methods 
Antibodies | ChIP-seq 
Eukaryotic cell lines | Flow cytometry 
Palaeontology | MRI-based neuroimaging 
Clinical data | 


Antibodies 


Antibodies used: 
Mouse anti-alpha-Synuclein Abcam Cat#: ab27766, RRID: AB_727020, 1:1000 for Western blot and Immunofluorescence 
Rabbit anti-alpha-Synuclein CellSignaling Cat#: 2642, RRID: AB_10695412, 1:1000 for Western blot and Immunofluorescence 
Mouse anti-Hsc70 Abcam Cat#: ab2788, RRID: AB_303301, 1:1000 for Western blot 
Mouse anti-Hsp90 Beta Abcam Cat#: ab53497, RRID: AB_881097, 1:3000 for Western blot 
Rabbit anti-COX IV ProteinTech Cat#: 11242-1-AP, RRID: AB_2085278, 1:500 for Immunofluorescence 
Rabbit anti-COX IV Abcam Cat#: ab16056, RRID: AB_443304, 1:1000 for Immunofluorescence 
Mouse anti-GAPDH Thermo Fisher Scientific Cat#: GA1R, RRID: AB_10751612, 1:5000 for Western blot 


Validation: 
Mouse anti-alpha-Synuclein Abcam Cat#: ab27766, validated by Abcam 
Rabbit anti-alpha-Synuclein CellSignaling Cat#: 2642, validated by CellSignaling and by our lab 
Mouse anti-Hsc70 Abcam Cat#: ab2788, validated by Abcam 
Mouse anti-Hsp90 Beta Abcam Cat#: ab53497, validated by Abcam 
Rabbit anti-COX IV ProteinTech Cat#: 11242-1-AP, validated by ProteinTech 
Rabbit anti-COX IV Abcam Cat#: ab16056, validated by Abcam 
Mouse anti-GAPDH Thermo Fisher Scientific Cat#: GA1R, validated by Thermo Fisher Scientific 


Eukaryotic cell lines 


Policy information about cell lines 


Cell line source(s): Flp-In™ 293 cells were purchased from Thermo Fisher Scientific (R75007), HEK-293 were purchased from the American Type 
Culture Collection (CRL-1573). 

Authentication: The authenticity of the cells was provided by Thermo Fisher Scientific and the American Type Culture Collection upon 
purchase. We have not authenticated these cell lines. 

Mycoplasma contamination: The cells were tested for mycoplasma contamination every four weeks. Only mycoplasma-free cultures were used for the 
experiments. 

Commonly misidentified lines (see ICLAC register): No commonly misidentified cells were used. 

Conservation scientist Aerin Jacob (right) conducts field work with a colleague in British Columbia, Canada, in 2018. 


SECRETS TO WRITING 
A WINNING GRANT 


Experienced scientists reveal how to avoid application 
pitfalls to submit successful proposals. By Emily Sohn 


When Kylie Ball begins a grant-writing 
workshop, she often alludes to the funding 
successes and failures that she has 
experienced in her career. “I 
say, ‘I’ve attracted more than $25 million in 
grant funding and have had more than 60 
competitive grants funded. But I’ve also had 
probably twice as many rejected.’ A lot of 
early-career researchers often find those rejections 
really tough to take. But I actually think 
you learn so much from the rejected grants.” 
Grant writing is a job requirement for 
research scientists who need to fund projects 
year after year. Most proposals end in rejection, 
but missteps give researchers a chance to learn 
how to find other opportunities, write better 
proposals and navigate the system. Taking 
time to learn from the setbacks and successes 
of others can help to increase the chances of 
securing funds, says Ball, who runs workshops 
alongside her role as a behavioural scientist 
at Deakin University in Melbourne, Australia. 


Do your research 

Competition for grants has never been 
more intense. The European Commission’s 
Horizon 2020 programme is the European 
Union’s largest-ever research and innovation 
programme, with nearly €80 billion 
(US$89 billion) in funding set aside between 
2014 and 2020. It reported a 14% success rate 
for its first 100 calls for proposals, although 
submissions to some categories had lower success 
rates. The commission has published its 
proposal for Horizon Europe, the €100-billion 
programme that will succeed Horizon 2020. In 
Australia, since 2017, the National Health and 
Medical Research Council has been funding 
less than 20% of proposals it receives. And 
the US National Science Foundation (NSF) 
received 49,415 proposals and funded 11,447 
of them in 2017, less than 25%. That’s tens of 
thousands of rejections in a single year from 
the NSF alone. 

© 2020 Springer Nature Limited. All rights reserved. 
Work / Careers 


Being a renowned scientist doesn’t ensure 
success. On the same day that molecular 
biologist Carol Greider won a Nobel prize in 
2009, she learnt that her recently submitted 
grant proposal had been rejected. “Even on the 
day when you win the Nobel prize,” she said in a 
2017 graduation speech at Cold Spring Harbor 
Laboratory in New York, “sceptics may question 
whether you really know what you're doing.” 

To increase the likelihood of funding 
success, scientists suggest doing an exten- 
sive search of available grants and noting 
differences in the types of project financed 
by various funding bodies. Government agen- 
cies such as the NSF tend to be interested in 
basic science that addresses big, conceptual 
questions, says Leslie Rissler, programme 
director at the NSF’s Division of Environmen- 
tal Biology in Alexandria, Virginia. A private 
foundation, however, might prioritize pro- 
jects that inform social change or that have 
practical implications that fit into one of its 
specific missions. 


Pitching a proposal 

Before beginning an application, you should 
read descriptions and directions care- 
fully, advises Ball, who recently pored over 
200 pages of online material before starting 
a proposal. That effort can save time in the end, 
helping researchers to work out which awards 
are a good fit and which aren't. “If you're not 
absolutely spot on with what they’re looking 
for, it may not be worth your time in writing 
that grant,” she says. 

Experienced scientists suggest studying suc- 
cessful proposals, which can often be acquired 
from trusted colleagues and supervisors, uni- 
versity libraries or online databases. A website 
called Open Grants, for example, includes more 
than 200 grants, both successful and unsuccessful, 
that are free to peruse. 

Grant writers shouldn't fear e-mailing or 
calling a grants agency to talk through their 
potential interest in a project, advises Amanda 
Stanley, executive director at COMPASS, a non-profit 
organization based in Portland, Oregon, 
that supports environmental scientists. For 
six years, she worked as a programme officer 
for the Wilburforce Foundation in Seattle, 
Washington, which supports conservation 
science. At this and other private foundations, 
the application process often begins with a ‘soft 
pitch’ that presents a brief case for the pro- 
ject. Those pitches should cover several main 
points, Stanley says: “‘Here’s what I’m trying 
to do. Here’s why it’s important. Here’s a little 
bit about me and the people I’m collaborating 
with. Would you like to talk further?’” She notes 
that a successful proposal must closely align 
with a foundation’s strategic goals. 

Each organization has its own process, but 
next steps typically include a phone conver- 
sation, a written summary and, finally, an 
invitation to submit a formal application. 
“Once you've gotten that invitation to submit 
a proposal from the programme officer, your 
chances of getting funded are really, really 
high,” Stanley says. 


The write stuff 


Applicants should put themselves in the shoes 
of grant reviewers, who might need to read doz- 
ens of applications about complicated subjects 
that lie outside their own fields of expertise, 
often while juggling their own research. 
“Imagine you're tired, grumpy and hungry. 
You’ve got 50 applications to get through,” 
says Cheryl Smythe, international grants 
manager at the Babraham Institute, a life- 
sciences research institution in Cambridge, 
UK. “Think about how you as an applicant can 
make it as easy as possible for them.” 
Formatting is an important consideration, 
says Aerin Jacob, a conservation scientist at the 
Yellowstone to Yukon Conservation Initiative 
in Canmore, Canada. White space and bold 
headings can make proposals easier to read, 
as can illustrations. “Students are tempted and 
sometimes encouraged to squeeze in as much 
information as possible, so there are all kinds 
of tricks to fiddle with the margin size, or to 
make the font a little bit smaller so that you 
can squeeze in that one last sentence,” Jacob 
says. “For a reviewer, that’s exhausting to read.” 
Ball advises avoiding basic deal-breakers, 
such as spelling errors, grammatical slips and 
lengthy proposals that exceed word limits. 
Those kinds of mistake can cast doubt on how 
rigorous applicants will be in their research, she 
says. A list of key words, crucial for indexes and 
search engines, should be more than an afterthought, 
Ball adds. On a proposal for a project 
on promoting physical activity among women, 
she tagged her proposal with the word ‘women’. 
The descriptor was too broad, and her application 
ended up with a reviewer whose expertise 
appeared to be in sociology and gender studies 
instead of in exercise or nutrition. The grant 
didn’t score well in that round of review. 

To prevent a reviewer's eyes from glazing 
over, Jacob says, use clear language instead of 
multisyllabic jargon. When technical details are 
necessary, follow up a complex sentence with 
one that sums up the big picture. Thinking back 
to her early proposals, Jacob remembers cramming 
in words instead of getting to the point. 
“It was probably something like, ‘I propose to 
study the heterogeneity of forest landscapes 
in spatial and temporal recovery after multiple 
disturbances,’ rather than, ‘I want to see what 
happens when a forest has been logged, burnt 
and farmed, and grows back,’” she says. 

Grants can be more speculative and more 
self-promotional than papers are, Rissler adds. 
“A grant is about convincing a jury that your 
ideas are worthy and exciting,” she says. “You 
can make some pretty sweeping generaliza- 
tions about what your proposed ideas might 
do for science and society in the long run. A 
paper is much more rigid in terms of what you 
can say and in what you must say.” 

Getting some science communication 
training can be a worthwhile strategy for 
strengthening grant-writing skills, Stanley 
says. When she was reviewing pitch letters for 
a private foundation, she recalls that lots of 
scientists couldn’t fully explain why their work 
mattered. But when she received pitches that 
were clear and compelling, she was more willing 
to help those scientists brainstorm other 
possible funding agencies if her foundation 
wasn’t the right fit. Scientists who sent strong, 
albeit unsuccessful, applications were also 
more likely to get funding from the foundation 
for later projects. 

Grants manager Cheryl Smythe (left) allows for IT glitches when submitting grant proposals. 


Science storytelling 


To refine project pitches and proposals, 
Stanley recommends that scientists use a free 
communication tool from COMPASS called 
the Message Box Workbook, which can help 
to identify key points and answer the crucial 
question for every audience: ‘So what?’ Scien- 
tific conferences often provide symposia or ses- 
sions that include funders and offer helpful tips 
for writing grants. And development officers 
at institutions can help scientists to connect 
with funders. “A good development officer is 
worth their weight in gold,” Stanley says. “Make 
friends with them.” 

Jacob has taken science-communication 
training through COMPASS, The Story Collider 
(a science-storytelling organization) and 
other such organizations. She has learnt how 
to talk about her work in the manner of a 
storyteller. In proposals and interviews, she 
now includes personal details, when relevant, 
that explain the problems she wants to address 
and why she decided to speak out about con- 
servation — an example of the kind of conflict 
and resolution that builds a good story. Jacob 
senses that the approach strikes a chord. “As a 
reviewer, you remember somebody’s proposal 
just that little bit more,” she says. “If you have 
a stack of proposals, you want to find the one 
that you connect with.” 

A clear focus can help to boost a grant to the 
top of a reviewer’s pile, Ball adds. In one of the 
first large grants that she applied for, she proposed 
collecting information on the key factors 
that prevent weight gain as well as designing 
and implementing an obesity-intervention programme. 
In retrospect, it was too much within 
the grant’s two-year time frame. She didn’t get 
the funding, and the feedback she received was 
that it would have worked better as two sepa- 
rate proposals. “While it’s tempting to want 
to claim that you can solve these enormous, 
challenging and complex problems in a single 
project,” Ball says, “realistically, that’s usually 
not the case.” 

Teaming up with collaborators can also 
increase the chance of success. Earlier this 
year, Ball was funded by the Diabetes Australia 
Research Program for a study that she proposed 
in collaboration with hospital clinicians, helping 
disadvantaged people with type 2 diabetes to 
eat healthy diets. Earlier in her career, she had 
written grants based on her own ideas, rather 
than on suggestions from clinicians or other 
non-academic partners. This time, she says, she 
focused on a real-world need rather than on her 
own ideas for a study. Instead of overreaching, 
she kept the study small and preliminary, allowing 
her to test the approach before trying to get 
funding for larger trials. 

It is acceptable — even advisable — to admit 
a study’s limitations instead of trying to meet 
preconceived expectations, Jacob adds. In 
2016, she had a proposal rejected for a study 
on spatial planning on the west coast of 
Canada that would, crucially, be informed by 
knowledge from Indigenous communities. She 
resubmitted the same proposal the next year to 
the same reviewers, but with a more confident 
and transparent approach: she was straightfor- 
ward about her desire to take a different tack 
from the type of research that had been tried 
before. This time, she made it clear that she 
wanted to listen to Indigenous peoples and use 
their priorities to guide her work. She got the 
funding. “I saw that if I tried to change it to meet 
what I thought funders wanted, I might not be 
accurately representing what I was doing,” she 
says. “I just wanted to be really clear with myself 
and really clear with the interviewers that this is 
who I am, and this is what I want to do.” 


What not to do 


Writing is hard, and experienced grant writers 
recommend devoting plenty of time to the task. 
Smythe recommends setting aside a week for 
each page of a proposal, noting that some 
applications require only a few pages while 
major collaborative proposals for multi-year 
projects can run to more than 100 pages. “It 
can take months to get one of these together,” 
she says. 

Scheduling should include time for rewrites, 
proofreads and secondary reads by friends, 
colleagues and family members, experts say. 
Working right up to the deadline can undo 
weeks to months of hard work. At the last min- 
ute, Jacob once accidentally submitted an earlier 
draft instead of the final version. It included sections 
that were bolded and highlighted, with 
comments such as, ‘NOTE TO SELF: MAKE THIS 
PART SOUND BETTER.’ She didn’t get that one, 
and has never made the same mistake again. 

Add an extra buffer for technology malfunc- 
tions, adds Smythe, who once got a call from 
a scientist at another organization who was 
in a panic because his computer had stopped 
working while he was trying to submit a grant 
proposal half an hour before the deadline. She 
submitted it for him with 23 seconds to spare. 
“My hand was shaking,” she says. That proposal 
was not successful, although the scientist sent 
her a nice bottle of champagne afterwards. 

Grant writing doesn’t necessarily end with 
a proposal’s submission. Applicants might 
receive requests for rewrites or more informa- 
tion. Rejections can also come with feedback, 
and if they don't, applicants can request it. 

Luiz Nunes de Oliveira, a physicist at the 
University of São Paulo, Brazil, also works as 
a programme coordinator at the São Paulo 
Research Foundation. In this role, he some- 
times meets with applicants who want to follow 
up on rejected proposals. “We sit down and go 
through their résumé, and then you find out 
that they had lots of interesting stuff to say 
about themselves and they missed the opportunity,” 
he says. “All it takes is to write an e-mail 
message asking [the funder] for an interview.” 

Jacob recommends paying attention to such 
feedback to strengthen future proposals. To 
fund her master’s programme, she applied for 
a grant from the Natural Sciences and Engi- 
neering Research Council of Canada (NSERC), 
but didn’t get it on her first try. After request- 
ing feedback by e-mail (to an address she found 
buried on NSERC’s website), she was able to see 
her scores by category, which revealed that 
a few bad grades early in her undergraduate 
programme were her limiting factor. 

There was nothing she could do about her 
past, but the information pushed her to work 
harder on other parts of her application. After 
gaining more research and field experience, 
co-authoring a paper and establishing rela- 
tionships with senior colleagues who would 
vouch for her as referees, she finally secured 
funding from NSERC on her third try, two years 
after her first rejection. 

Negative feedback can be one of the best 
learning experiences, Rissler adds. She kept 
the worst review she ever received, a scathing 
response to a grant proposal she submitted 
to the NSF in 2003, when she was a postdoc 
studying comparative phylogeography. The 
feedback, she says, was painful to read. It 
included comments that her application was 
incomprehensible and filled with platitudes. 

After she received that letter, which is now 
crinkled up in her desk for posterity, Rissler 
called a programme officer to ask why they 
let her see such a negative review. She was told 
that the critical commenter was an outlier and 
that the panel had gone on to recommend her 
project for the grant, which she ultimately 
received. “I learnt that you do need to be 
tough,” says Rissler, who now helps to make 
final decisions on funding for other scientists. 
She emphasizes that whereas reviewers’ opin- 
ions can vary, all proposals undergo multiple 
independent expert reviews, followed by 
panel discussions and additional oversight 
by programme directors. 

Grant writing tends to provoke anxiety 
among early-career scientists, but oppor- 
tunities exist for people who are willing to 
take the time to develop ideas and push past 
rejections and negative feedback, she says. 
“We can’t review proposals that we don’t get.” 


Emily Sohn is a freelance journalist in 
Minneapolis, Minnesota. 

I don’t like thinking alone. I like the bustle 
and energy of other people. So I do 
my thinking in a glass-covered, grand 
atrium with gigantic windows, where I’m 
surrounded by people. I like to be able to 
think and write, then look up and see people 
talking, thinking and writing. We're all trying 
to wrestle with some problem. 

I oversee everything that happens at the 
Francis Crick Institute: from our discovery 
research to our engagement with schools, 
the local community and the public. 

I work with colleagues to develop the 
Crick’s strategy for delivering high-quality 
research that unlocks deeper understanding 
of the biology underlying human health and 
disease. That strategy includes bringing in 
the best scientific talent, and supporting the 
UK biomedical research endeavour. 

The Crick is a grand building shaped a bit 
like a cathedral. We have glass walls at the 
four ends of the central atrium. 

At the east end, some glass is treated 
with a special refractive film, so the 
colour changes depending on the angle 
between the light source and the viewer. 
It’s very beautiful. 

I quite often live in my head, and my mind 
wanders over a range of things. 

In my laboratory, I study cells to find out 
how they work as the fundamental unit of 
life. This is difficult and complex, and I like 
an environment that I find both stimulating 
and restful at the same time. 

Sometimes I’m really disciplined, focused 
and wrestling with the problem in hand. 
Sometimes I’m daydreaming, looking 
at the rest of the world and allowing my 
mind to wander into a wider environment. 
So I vacillate between those two types of 
thinking. 

What I like about the glass in this atrium is 
that you don’t feel constrained. If you sit in a 
little office, then you’re physically cramped 
and your brain is cramped, too. 

Here, looking up through the windows and 
to the sky, my mind can expand beyond its 
normal confines. 


Paul Nurse, geneticist and cell biologist, 
is director of the Francis Crick Institute in 
London, UK. Interview by Josie Glausiusz. 


